Abstract

Compilers for ML and Haskell use intermediate languages
that incorporate deeply-embedded assumptions about order
of evaluation and side effects. We propose an intermediate
language into which one can compile both ML and Haskell,
thereby facilitating the sharing of ideas and infrastructure,
and supporting language developments that move each
language in the direction of the other. Achieving this goal
without compromising the ability to compile as good code as a
more direct route turned out to be much more subtle than
we expected. We address this challenge using monads and
unpointed types, identify two alternative language designs,
and explore the choices they embody.


1  Introduction 

Functional programmers are typically split into two camps:
the strict (or call-by-value) camp, and the lazy (or
call-by-need) camp. As the discipline has matured, though, each
camp has come more and more to recognise the merits of the
other, and to recognise the huge areas of common interest.
It is hard, these days, to find anyone who believes that
laziness is never useful, or that strictness is always bad. While
there are still pervasive stylistic differences between strict
and lazy programming, it is now often possible to adopt lazy
evaluation at particular places in a strict language (Okasaki
[1996]), or strict evaluation at particular points in a lazy one
(for example, Haskell's strictness annotations (Peterson et
al. [1997])).

This rapprochement has not yet, however, propagated to
our implementations. The insides of an ML compiler look
pervasively different to those of a Haskell compiler. Notably,
sequencing and support for side effects and exceptions are
usually implicit in an ML compiler's intermediate language
(IL), but explicit (where they occur) in a Haskell compiler
(Launchbury & Peyton Jones [1995]). On the other hand,
thunk formation and forcing are implicit in a Haskell
compiler's intermediate language, but explicit in an ML
compiler. These pervasive differences make it impossible to
share code, and hard to share results and analyses, between
the two styles.

To say that “support for side effects is implicit in an ML
compiler's IL” (for example) is not to say that an ML
compiler will take no notice of side effects: on the contrary, an
ML compiler might well perform a global analysis that
identifies pure sub-expressions (though in practice few do).
However, one might wonder whether the analysis would discover
all the pure sub-expressions in a Haskell program translated
into the IL. In the same way, if an ML program were
translated into a Haskell compiler's IL, the latter might not
discover all the occasions in which a function argument was
guaranteed to be already evaluated. This thought motivates
the following question: could we design a common compiler
intermediate language (IL) that would serve equally well for
both strict and lazy languages? The purpose of this paper is
to explore the design space for just such a language.

We restrict our attention to higher order, polymorphically
typed intermediate languages. There is considerable interest
at the moment in type-directed compilation for polymorphic
languages, in which type information is maintained
accurately right through compilation and even on to run time
(Harper & Morrisett [1995]; Shao & Appel [1995]; Tarditi et
al. [1996]). Hence we focus on higher order, statically typed
source languages, represented in this paper by ML (Milner
& Tofte [1990]) and Haskell (Peterson et al. [1997]).

At first we expected the design to be relatively
straightforward, but we discovered that it was not. In particular,
making sure that the IL has good operational properties for
both strict and lazy languages turns out to be rather subtle.
Identifying these subtleties is the main contribution of the
paper:

• We employ monads to express and delimit state,
input/output, and exceptions (Section 3). Using monads
in this way is now well known to theorists (Moggi
[1991]) and to language designers (Launchbury & Peyton
Jones [1995]; Peyton Jones & Wadler [1993];
Wadler [1992a]), but, with one exception¹, no compiler
that we know has monads built into its intermediate
language.

• We employ unpointed types to express the idea that
an expression cannot diverge (Section 3.1). We show
that the straightforward use of unpointed types does
not lead to a good implementation (Section 3.6). This
leads us to explore two distinct language designs. The
first, L1, is mathematically simple, but cannot be
compiled well (Section 3). An alternative design, L2, adds
operational significance to unpointed types, by
guaranteeing that a variable of unpointed type is evaluated
(Section 4); this means L2 can be compiled well, but
weakens its theory.

• We identify an interaction between unpointed types,
polymorphism, and recursion in L1 (Section 3.5).
Interestingly, the problem turns out to be more easily
solved in L2 than L1 (Section 4.2).

None of these ingredients are new. Our contribution is to
explore the interactions of mixing them together. We emerge
with the core of a practical IL that has something to offer
both the strict and lazy community in isolation, as well as
offering them a common framework. Our long-term goal is
to establish an intermediate language that will enable the
two communities to share both ideas (analyses,
transformations) and systems (optimisers, code generators, run-time
systems, profilers, etc) more effectively than hitherto.


2  The  ground  rules 

We seek an intermediate language (IL) with the following
properties:

• It must be possible to translate both (core) ML and
Haskell into the IL. Extensions that add laziness to
ML, or strictness to Haskell, should be readily
incorporated. We make no attempt to treat ML's module
system, though that would be a desirable extension.

• In order to accommodate ML and Haskell the IL's
type system must support polymorphism. This ground
rule turns out to have very significant, and rather
unfortunate, impact upon our language designs
(Section 3.5), but it seems quite essential. Nearly all
existing compilers generate polymorphic target code, and
although researchers have experimented with compiling
away polymorphism by type specialisation (Jones
[1994]; Tolmach & Oliva [1997]), problems with
separate compilation and potential code explosion remain
unresolved.

• The IL should be explicitly typed (Harper & Mitchell
[1993]). We have in mind a variant of System F
(Girard [1990]), with its explicit type abstractions and
applications. The expressiveness of System F really
is required. For example, there are several reasons
for wanting polymorphic arguments to functions: the
translation of Haskell type classes creates
“dictionaries” with polymorphic components; we would like to be
able to simulate modules using records (Jones [1996]);
rank-2 polymorphism is required to express
encapsulated state (Launchbury & Peyton Jones [1995])
and data-structure fusion (Gill, Launchbury & Peyton
Jones [1993]).

IL programs can readily be type-checked, but there
is no requirement that one could infer types from a
type-erased IL program.

• The IL should have a single well-defined semantics. On
the face of it, compilers for both strict and lazy
languages already use a common language, namely the
lambda calculus. But this similarity is only at the
level of syntax; the semantics of the two calculi differ
considerably. In particular, the code generator from
a strict-language compiler would be completely
unusable in a lazy-language compiler, and vice versa. Our
goal is to have a single, neutral, semantics, and hence
a single optimiser and code generator.

• ML (or Haskell) programs thus compiled should be
as efficient as those compiled by a good ML (resp.
Haskell) compiler. In other words, compiling through
the common IL should not impose any unavoidable
efficiency penalty, either by way of loss of transformations
(especially when starting from Haskell) or by way of
a less efficient basic evaluation model (especially when
starting from ML). Indeed, our hope is that we may
ultimately be able to generate better code through this
new route.


3  L1, a totally explicit language

It is clear that the IL must be explicit about things that are
implicit in “traditional” compiler ILs. Where are these
implicit aspects of a “traditional” IL currently made explicit?
Answer: in the denotational semantics of the IL. For
example, the denotational semantics of a call-by-value lambda
calculus looks something like this²:

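\[
\mathcal{S}[\![e_1\ e_2]\!]\rho =
\begin{cases}
(\mathcal{S}[\![e_1]\!]\rho)\ b & \text{if } a = b_\bot \\
\bot & \text{if } a = \bot
\end{cases}
\qquad \text{where } a = \mathcal{S}[\![e_2]\!]\rho
\]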

Here, the two cases in the right-hand side deal with the
possible non-termination of the argument. What is implicit in
the IL (the evaluation of the argument, in this case)
becomes explicit in the semantics. An obvious suggestion is
therefore to make the IL reflect the denotational semantics
of the source language directly, so that everything is explicit
in the IL, and nothing remains to be explicated by the
semantics. This is our first design, L1.
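
The two cases of the call-by-value application equation can be
sketched in Haskell. This is only an illustration under stated
assumptions: Maybe is used as a stand-in for the lifted domain
(Nothing for ⊥, Just b for the lifted value), and the names Lifted
and applyCbv are ours, not part of any IL.

```haskell
-- Lifted domain: Nothing stands for bottom, Just b for the lifted value b.
-- (Maybe is only a modelling device chosen for this sketch.)
type Lifted a = Maybe a

-- Call-by-value application on denotations: the meaning of the argument
-- is inspected first, mirroring the two cases of the equation.
applyCbv :: (a -> Lifted b) -> Lifted a -> Lifted b
applyCbv _ Nothing  = Nothing   -- argument is bottom: the application is bottom
applyCbv g (Just b) = g b       -- argument is a lifted value: pass b itself
```

The case analysis that the semantics performs silently is exactly
what an explicit IL has to write down in its terms.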

Figure 1 gives the syntax and type rules for L1. We note
the following features:

• As a compromise in the interest of brevity all our
formal material describes only a simply-typed calculus,
although supporting polymorphism is one of our
ground rules. The extensions to add polymorphism,
complete with explicit type abstractions and
applications in the term language, are fairly standard (Harper
& Mitchell [1993]; Peyton Jones [1996]; Tarditi et al.
[1996]). However, polymorphism adds some extra
complications (Sections 3.5, 3.6).

• We omit recursive data types, constructors, and case
expressions for the sake of simplicity, being content
with pairs and selectors.

• let is simply very convenient syntactic sugar. It is not
there to introduce polymorphism, even in the
polymorphic extension of the language; explicit typing removes
this motivation for let.

• letrec introduces recursion. Though we only give it
one binding here, our intention is that it should
accommodate multiple bindings. We use it rather than
a constant fix because the latter requires heavy
encoding for mutual recursion that is not reflected in
an implementation. We discuss recursion in detail in
Section 3.5, including the unspecified side condition
mentioned in the rule.

² We use the following standard notation. If T is a complete partial
order (CPO), then the CPO $T_\bot$, pronounced “T lifted”, is defined
thus: $T_\bot = \{a_\bot \mid a \in T\} \cup \{\bot\}$, with the obvious ordering.

Figure 1: Syntax and type rules for L1

• Following Moggi [1991], we express “computational
effects” (such as non-termination, assignment,
exceptions, and input/output) in monadic form. The type
M τ is the type of M-computations returning a value
of type τ, where M is drawn from a fixed family of
monads. The syntactic forms letM and retM are
the bind and unit combinators of the monad M. The
only two monads we consider for now are the lifting
monad, Lift, and the combination of lifting with the
state transformer monad, ST. It is a straightforward
extension to include the monads of exceptions and
input/output as well.

This use of monads appears to contradict our goal that
L1 should have a trivial semantics. We discuss the
reasons for this decision in Section 3.4.

Figure 2 gives the semantics of L1. The semantic function
T gives the meaning of types. If it looks somewhat boring,
that is the point! The function arrow in L1 is interpreted by
function arrow in the underlying category of complete
partial orders (CPO), product is interpreted by (categorical, i.e.
un-lifted) product, and integers are interpreted by the
integers. (If L1 were expanded to have sum types, they would
be interpreted by (categorical, separated) sums.) Lastly,
each monad is specified by an interpretation. The monad
of lifting is interpreted by lifting, while a state transformer
is interpreted by a function from the current “state” to a
result and the new state. The “state” is a finite mapping
from location identifiers (modeled by the natural numbers,
ℕ) to their contents.

Figure 2: Semantics of L1

The semantic function E gives the meaning of expressions.
Again, many of its equations are rather dull: application
is interpreted by application in the underlying category,
lambda abstraction by functional abstraction, and so on.
The semantics of the two monads is given by their bind and
unit functions. From the semantics one can prove that both
β and η are valid with respect to the semantics, and that
monadic expressions admit a number of standard
transformations, given in Figure 3.

(M1)  letM x <- retM e in b                 =  let x:τ = e in b
(M2)  letM x <- (letM y <- e1 in e2) in b   =  letM y <- e1 in (letM x <- e2 in b)     y ∉ fv(b)
(M3)  letM x <- (let y = e1 in e2) in b     =  let y = e1 in (letM x <- e2 in b)       y ∉ fv(b)
(M4)  letM x <- (letrec y = e1 in e2) in b  =  letrec y = e1 in (letM x <- e2 in b)    y ∉ fv(b)
(M5)  letM x <- e in retM x                 =  e
(M6)  let x = e in retM b                   =  retM (let x = e in b)

Figure 3: Monad transformations
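
The purely monadic laws (M1), (M2) and (M5) can be tried out in
Haskell by reading letM x <- e in b as e >>= \x -> b and retM e as
return e. The sketch below makes two assumptions of its own: Maybe
stands in for the Lift monad, and the names lawM1, lawM2 and lawM5
are hypothetical. It tests the laws on sample values and proves
nothing.

```haskell
-- Maybe plays the role of Lift (Nothing models bottom); this is an
-- assumption of the sketch, not part of L1.

-- (M1) left unit: letM x <- retM e in b  =  let x = e in b
lawM1 :: Eq b => a -> (a -> Maybe b) -> Bool
lawM1 e b = (return e >>= b) == b e

-- (M2) associativity. The side condition "y not free in b" holds by
-- construction here, because b is a function of x only.
lawM2 :: Eq c => Maybe a -> (a -> Maybe b) -> (b -> Maybe c) -> Bool
lawM2 e1 e2 b = ((e1 >>= e2) >>= b) == (e1 >>= \y -> e2 y >>= b)

-- (M5) right unit: letM x <- e in retM x  =  e
lawM5 :: Eq a => Maybe a -> Bool
lawM5 e = (e >>= return) == e
```

An optimiser that keeps letM and retM as primitive forms can apply
these equations as rewrite rules without knowing how M is
implemented.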


3.1  Termination  and  non-termination 

As we have mentioned, the interpretation of a type in L1
is a complete partial order (CPO). However, the
interpretation of a type is not necessarily a pointed CPO; that is, the
CPO does not necessarily contain a bottom element. For
example, the data type of integers, Int, is interpreted by
the unpointed CPO of integers, ℤ. That is, if an expression
has type Int, then it denotes an integer, and cannot denote
a non-terminating computation. How, then, do we express
the type of possibly-diverging integer-valued computations?
As we have seen, L1 has an explicit type constructor for
each monadic (i.e. computation) type, of which lifting is
one. To express the type of a possibly-diverging integer we
use the lifting monad. A possibly-diverging integer-valued
expression therefore has type Lift Int.

So L1's type system can distinguish surely-terminating
expressions from possibly-diverging ones. The main reason
for making this distinction in the type system is so that we
can express the idea that a function takes an evaluated
argument. The L1 lambda abstraction \x:Int.e expresses that
x cannot possibly be ⊥, and so is a suitable translation of a
lambda abstraction from a call-by-value language. On the
other hand \x:Lift Int.e expresses that x might perhaps
be ⊥, which fits a call-by-name or call-by-need language.
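
A loose analogy for the two kinds of lambda abstraction is available
in GHC, whose unlifted primitive type Int# has no bottom element,
while the ordinary boxed Int may be an unevaluated thunk. The sketch
below uses GHC's MagicHash extension; the function names are ours,
and the correspondence with \x:Int.e and \x:Lift Int.e is only
suggestive, not a claim about how L1 would be implemented.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (+#))

-- Analogue of \x:Int.e : a variable of type Int# is always bound to
-- an evaluated machine integer, never to a thunk.
incUnlifted :: Int# -> Int#
incUnlifted x = x +# 1#

-- Unpacking a boxed Int forces it, after which the unlifted
-- function can be applied.
incBoxed :: Int -> Int
incBoxed (I# x) = I# (incUnlifted x)

-- Analogue of \x:Lift Int.e : the argument may be bottom, and is
-- never forced because the body ignores it.
ignoreLazy :: Int -> Int
ignoreLazy _ = 42
```

For instance, ignoreLazy undefined evaluates to 42, whereas
incBoxed undefined would fail when it tries to unpack its argument.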

A second motivation for distinguishing pointed types from
unpointed ones is that some useful program
transformations that are not valid in general hold unconditionally
when one has more control over pointedness. Several
researchers have explored languages that employ a
distinction between pointed and unpointed types (Howard [1996];
Launchbury & Paterson [1996]), and others have explored
pure languages without pointed types altogether (Cockett
& Fukushima [1992]; Hagino [1987]; Turner [1995]). The
presence of unpointed types has consequences for recursion,
as we discuss in Section 3.5.


Types     S, T ::= Int | () | S * T | S -> T | Ref S
  Haskell only   | ST S

Terms     M, N ::= x | i | M N | \x:T.M | M + N
                 | letrec x:T = M in N
                 | let x:T = M in N
                 | pair M N | fst M | snd M
                 | new M | rd M | wr M N
  Haskell only   | letST x:T <- M in N | retST M

Integers  i    ::= 0 | 1 | 2 | ...

“ML” constants
  new : ∀α. α -> Ref α
  rd  : ∀α. Ref α -> α
  wr  : ∀α. Ref α -> α -> ()

“Haskell” constants
  new : ∀α. α -> ST (Ref α)
  rd  : ∀α. Ref α -> ST α
  wr  : ∀α. Ref α -> α -> ST ()

Figure 4: Syntax of S


This use of monads is well known. Moggi pioneered the
idea of using monads to encapsulate computations (Moggi
[1991]; Wadler [1992a]). The lazy functional programming
community has been using monads very effectively to isolate
and encapsulate stateful computations and input/output
within pure, lazy programs (Launchbury & Peyton Jones
[1995]; Peyton Jones, Gordon & Finne [1996]; Peyton Jones
& Wadler [1993]; Wadler [1992b]). Nevertheless, there are
surprisingly subtle design choices to make, as we discuss in
Section 3.4.


3.2  Stateful  computations 

In a similar way, we use the ST monad to express in the type
system the distinction between pure and stateful
computations. For example, an expression of type Lift Int denotes
a pure (side-effect free), albeit possibly-divergent,
computation; on the other hand, an expression of type ST Int
denotes a computation that might diverge³, or might perform
some side effects on a global state and deliver an integer.
Further monads can readily be added to model exceptions,
or continuations, or input/output.

³ ST combines lifting with state. It would be possible to separate
the two, as we discuss in Section 7.
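
For readers who know the Haskell side better, the distinction
between pure and stateful computations corresponds closely to the
way GHC's base library separates ordinary expressions from
computations in its ST monad. The following sketch uses the real
Control.Monad.ST and Data.STRef modules; the function sumTo is our
own example. It uses runST to return to pure code, which lies
outside what this paper treats, and GHC's ST carries an extra state
type parameter that L1's ST does not.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)

-- A stateful computation: allocate a mutable cell, update it in a
-- loop, and finally read it. Every step is sequenced by the monad.
sumTo :: Int -> Int
sumTo n = runST $ do
  ref <- newSTRef 0                      -- like "new"
  let loop i
        | i > n     = pure ()
        | otherwise = do
            acc <- readSTRef ref         -- like "rd"
            writeSTRef ref (acc + i)     -- like "wr"
            loop (i + 1)
  loop 1
  readSTRef ref
```

Here sumTo 10 evaluates to 55. The type of each step records that
it may touch the state, which is the information the ST type
constructor carries in L1.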


3.3  Translating ML and Haskell into L1

Before discussing its design any further, we first emphasise
L1's role as a target for both strict, stateful, and pure, lazy
languages by giving translations from both into L1. Figure 4
gives the syntax of a tiny generic source language, S. We
regard S as a prototype for either ML or Haskell, by giving
it a strict or lazy interpretation respectively. In either case,
S is assumed to have been explicitly annotated with type
information by a type inference pass.

The constants pair, fst, snd have the same (obvious) S
types in both interpretations. The constants new, rd, wr
create, read, and write a mutable variable. Unlike pair,
their types differ in the two interpretations, as Figure 4
shows. In the lazy interpretation their types explicitly
involve the source-language ST monad, and S also includes
letST and retST, the bind and unit operations for ST.
Modulo syntax, this is precisely how Haskell expresses stateful
computation (Launchbury & Peyton Jones [1995]).

“ML” translation ℳ

ℳ[[Int]]     = Int
ℳ[[S * T]]   = (ℳ[[S]], ℳ[[T]])
ℳ[[()]]      = ()
ℳ[[S -> T]]  = ℳ[[S]] -> ST ℳ[[T]]
ℳ[[Ref S]]   = Ref (ℳ[[S]])

ℳ[[x]]       = retST x
ℳ[[i]]       = retST i
ℳ[[M N]]     = letST f <- ℳ[[M]] in
               letST a <- ℳ[[N]] in
               f a
ℳ[[\x:T.M]]  = retST (\x:ℳ[[T]]. ℳ[[M]])
ℳ[[let x:T = M in N]]
             = letST x:ℳ[[T]] <- ℳ[[M]] in ℳ[[N]]
ℳ[[letrec f:S->T = \x:S.M in N]]
             = letrec f:ℳ[[S->T]] = \x:ℳ[[S]]. ℳ[[M]] in ℳ[[N]]
ℳ[[pair M N]] = letST a <- ℳ[[M]] in
                letST b <- ℳ[[N]] in
                retST (a, b)
               ... and similarly wr, +
ℳ[[fst M]]   = letST a <- ℳ[[M]] in
               retST (fst a)
               ... and similarly snd, new, rd

“Haskell” translation ℋ

ℋ[[Int]]     = Lift Int
ℋ[[S * T]]   = Lift (ℋ[[S]], ℋ[[T]])
ℋ[[()]]      = Lift ()
ℋ[[S -> T]]  = ℋ[[S]] -> ℋ[[T]]
ℋ[[ST T]]    = ST (ℋ[[T]])
ℋ[[Ref S]]   = Lift (Ref (ℋ[[S]]))

ℋ[[x]]       = x
ℋ[[i]]       = retLift i
ℋ[[M N]]     = ℋ[[M]] ℋ[[N]]
ℋ[[\x:T.M]]  = \x:ℋ[[T]]. ℋ[[M]]
ℋ[[let x:T = M in N]]
             = let x:ℋ[[T]] = ℋ[[M]] in ℋ[[N]]
ℋ[[letrec x:T = M in N]]
             = letrec x:ℋ[[T]] = ℋ[[M]] in ℋ[[N]]
ℋ[[pair M N]] = retLift (ℋ[[M]], ℋ[[N]])
ℋ[[M + N]]   = letLift a <- ℋ[[M]] in
               letLift b <- ℋ[[N]] in
               retLift (a + b)
ℋ[[fst M]]   = letLift a <- ℋ[[M]] in fst a
               ... similarly snd
ℋ[[wr M N]]  = letST a <- liftToST ℋ[[M]] in
               wr a ℋ[[N]]
               ... similarly new, rd
ℋ[[letST x:T <- M in N]]
             = letST x:ℋ[[T]] <- ℋ[[M]] in ℋ[[N]]
ℋ[[retST M]] = retST ℋ[[M]]

Figure 5: Translations of “ML” and “Haskell” into L1

Then Figure 5 gives two translations of S into L1:


• The “ML” translation, ℳ⁴, gives the source language
a stateful, strict, semantics. The result of a term
translated by ℳ is a computation in the ST monad, and
functions also return computations in ST. That is, if
the ML type system considers that Γ ⊢ e : τ, then

ℳ[[Γ]] ⊢ ℳ[[e]] : ST ℳ[[τ]]

The rule for application uses letST to evaluate both
the function and its argument, and to sequence any
state changes they contain, before applying the
function to the argument. In expressions produced by
the ℳ translation, each variable is bound to a
non-monadic type; that is, any effects (state or
non-termination) are performed before binding the
variable. When a variable, lambda, or pair is translated
we simply return the value using retST. Lastly, a
recursive ML declaration can only bind a function: hence
the rule for letrec.

• The “Haskell” translation, ℋ, gives the source
language (minus the state-changing operations) a pure,
non-strict semantics. A key difference from the ML
translation is that the Haskell translation of data
types, such as integers, pairs, and lists, are lifted,
because Haskell allows values of these types to be
recursively defined. Unlike the ML translation, the
translation of Haskell's function type does not need to have
an explicit Lift on the codomain. Nor does the
translation ℋ necessarily return a Lift computation: if the
Haskell type system concludes that Γ ⊢ e : τ then

ℋ[[Γ]] ⊢ ℋ[[e]] : ℋ[[τ]]

ℋ translates Haskell's ST-monad computations
directly into L1's ST monad, just as you would hope⁵.
The only tiresome point is that the first argument of
wr has source-language type Ref τ, and hence has
L1 type Lift (Ref ℋ[[τ]]). It must therefore be lifted
into the ST monad using liftToST so that it can be
evaluated in the ST monad.

It is interesting to compare the two type translations. ℳ
uses exactly the call-by-value translation of Wadler [1992a],
with the computational effect at the end of the function
arrow. On the other hand ℋ does not use Wadler's
call-by-name translation, as one might otherwise expect. Indeed,
there is no monadic effect in the translation of function types
at all; instead the Lift monad shows up in the translation
of data types.

This translation of Haskell function types assumes that
\x.bot and bot, where bot has value ⊥, denote the same
value in Haskell. Recent changes to Haskell are likely to
allow these values to be distinguished, forcing a lifting of
function types, and hence a more gruesome encoding of function
application.
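
As a rough illustration of how differently the two translations of
Figure 5 treat the same source term, the sketch below implements
them for a small untyped fragment of S (variables, integers,
application and lambda). The data types Src and L1, the function
names, and the fresh-name scheme are assumptions of this sketch;
type annotations are omitted for brevity.

```haskell
-- A fragment of the source language S (type annotations omitted).
data Src = SVar String | SInt Integer | SApp Src Src | SLam String Src
  deriving (Eq, Show)

-- A fragment of L1 terms, again without types.
data L1 = Var String | Lit Integer | App L1 L1 | Lam String L1
        | LetST String L1 L1   -- letST x <- e1 in e2
        | RetST L1             -- retST e
        | RetLift L1           -- retLift e
  deriving (Eq, Show)

-- "ML" translation: every term becomes an ST computation. The Int
-- is a counter used to invent fresh binder names; the "%" in those
-- names is a hypothetical convention that keeps them distinct from
-- source variables.
translateML :: Src -> L1
translateML = fst . go 0
  where
    go :: Int -> Src -> (L1, Int)
    go n (SVar x)   = (RetST (Var x), n)
    go n (SInt i)   = (RetST (Lit i), n)
    go n (SLam x m) = let (m', n') = go n m in (RetST (Lam x m'), n')
    go n (SApp m k) =
      let f        = "f%" ++ show n
          a        = "a%" ++ show n
          (m', n1) = go (n + 1) m
          (k', n2) = go n1 k
      in (LetST f m' (LetST a k' (App (Var f) (Var a))), n2)

-- "Haskell" translation: only data values are wrapped (in Lift);
-- variables, application and lambda are translated structurally.
translateH :: Src -> L1
translateH (SVar x)   = Var x
translateH (SInt i)   = RetLift (Lit i)
translateH (SApp m k) = App (translateH m) (translateH k)
translateH (SLam x m) = Lam x (translateH m)
```

Translating the application g 3 with translateML produces two nested
letST bindings followed by the call, while translateH leaves the
application intact and merely wraps the literal in retLift.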


⁴ The translation given here introduces quite a few “administrative
redexes”; a slightly more complex translation can avoid them (Sabry
& Wadler [1996]).

⁵ We do not treat the runST encapsulator of Launchbury & Peyton
Jones [1995] here, but it is easy to do so.


3.4  Why not encode the monads?

We have said that L1 is meant to make everything explicit,
so that there is nothing to be said when giving its semantics.
In apparent contradiction, we made the semantics of the
monads implicit, that is, explained only by the semantics
of L1. Why, for example, did we not make the ST monad
explicit by representing a value of type ST τ as a
state-transforming function in L1, and representing letST and
retST using the other L1 forms? For example, instead of
the term

letST x <- e in b

we could write the L1 term

bindST e (\x.b)

where bindST is defined (directly in L1) as follows

bindST = \m k s. let p = m s in k (fst p) (snd p)

Here, the state passing is made explicit, but the state itself
is still abstract, supporting the new, read and write
operations. This is the approach advocated by Launchbury &
Peyton Jones [1995, Section 9]. It has the notable advantage
that we can simplify L1 by getting rid of letM and retM
entirely.
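
The state-passing encoding transcribes almost literally into
Haskell. In the sketch below the type synonym STenc, the unit retST
and the example computations are our own additions; the state is
kept as a type parameter, and the example instantiates it to a
plain Int counter rather than a store with new, read and write.

```haskell
-- A state transformer encoded as a function from a state to a
-- result paired with the new state.
type STenc s a = s -> (a, s)

-- Unit: return the value and leave the state untouched.
retST :: a -> STenc s a
retST x s = (x, s)

-- Bind, written as in the text: run m, then pass its result and
-- its final state to the continuation k.
bindST :: STenc s a -> (a -> STenc s b) -> STenc s b
bindST m k s = let p = m s in k (fst p) (snd p)

-- Example with an Int counter as the state.
tick :: STenc Int Int
tick n = (n, n + 1)

twoTicks :: STenc Int (Int, Int)
twoTicks = tick `bindST` \a -> tick `bindST` \b -> retST (a, b)
```

Running twoTicks 0 yields ((0, 1), 2). Nothing in this encoding
stops a program transformation from using the state s twice, which
is the fragility discussed in the reasons that follow.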

We do not adopt that approach here, for three reasons:

• Encoding the monad in purely functional terms is a
reasonable way of giving its semantics, but it may not
be a reasonable way of giving its implementation.
Consider, for example, the monad of exceptions in a strict
language. The functional encoding would perform a
conditional test whenever a possibly-exceptional value
was bound; but the expected implementation is
stack-based with no tests. Instead, a whole chunk of stack
is popped when an exception is raised. Keeping the
monad explicit in L1 allows the code generator to
generate efficient code.

• Even when an efficient code-generation strategy does
exist, its correctness may be fragile. For example,
Launchbury & Peyton Jones [1995] describe an
update-in-place implementation of the primitive
operations (read and write) in the state monad. However,
that implementation is only correct if the state
is single-threaded. That is certainly the case in the
terms produced by ℳ, but it might not remain the
case after performing L1 transformations. For example,
a β-expansion might duplicate the state.

It may be possible to preserve the single-threadedness
of the state by limiting the transformations performed
on the L1 program. (For example, we believe that
using only transformations that are correct in a
call-by-need calculus is sufficient (Sabry [1997]).) Even where
this is true, it creates a complicated proof obligation.

• There may be useful transformations available that are
specific to a particular monad (for example, swapping
the order of non-interfering assignments), but which
become inaccessible, or hard to spot, when expressed
in a purely-functional encoding of the monad.

We find these reasons compelling. On the other hand, we
were concerned that by not translating the monadic code
into a core of L1 we might lose valuable transformations. So
far, however, we have found no transformation that cannot
be expressed in the monadic version of L1, provided the
standard monad laws are implemented (Figure 3).


3.5  Recursion in L1

One consequence of our decision to allow a type to be
modeled by an unpointed CPO is that we have to take care
with recursion. The rule (REC) in Figure 1 suggests that a
letrec can be constructed at any type. But that is not so.
Consider

letrec x:Int = ... x ... in ...

Such a recursive definition is plainly nonsense, because Int
is an unpointed type and has no bottom element, so there
might be no solution, or many solutions, to the recursive
definition. We can only do recursion over pointed CPOs⁶.

How, then, can we make sense of recursion? One solution
is to link recursion to the Lift monad, since Lift adds a
bottom to its argument domain:

\[
(\mathrm{RECa})\quad
\frac{\Gamma,\ x : \mathsf{Lift}\ \tau \vdash e_1 : \mathsf{Lift}\ \tau
      \qquad
      \Gamma,\ x : \mathsf{Lift}\ \tau \vdash e_2 : \rho}
     {\Gamma \vdash \mathtt{letrec}\ x{:}\tau = e_1\ \mathtt{in}\ e_2 : \rho}
\]

This solution is not very satisfactory. For a start, it cannot
type:

letrec f = \x. ... in ...

because the type of a lambda abstraction has the form
τ -> ρ, not Lift τ, and lifting all functions raises the
spectre of having to force the definition on each recursive call.
Nor can it type recursive definitions of ST computations.
Furthermore, this loss of expressiveness is completely
unnecessary, since a function type whose result type is pointed
is itself pointed, and any ST computation is pointed. The
right solution is to fix (REC) by adding a side condition that
τ must be pointed:

\[
(\mathrm{RECb})\quad
\frac{\Gamma,\ x : \tau \vdash e_1 : \tau
      \qquad
      \Gamma,\ x : \tau \vdash e_2 : \rho
      \qquad
      \vdash \tau\ \text{pointed}}
     {\Gamma \vdash \mathtt{letrec}\ x{:}\tau = e_1\ \mathtt{in}\ e_2 : \rho}
\]

\[
\frac{}{\vdash (\mathsf{Lift}\ \tau)\ \text{pointed}}
\qquad
\frac{}{\vdash (\mathsf{ST}\ \tau)\ \text{pointed}}
\qquad
\frac{\vdash \tau_1\ \text{pointed}}{\vdash (\tau_2 \to \tau_1)\ \text{pointed}}
\qquad
\frac{\vdash \tau_1\ \text{pointed} \quad \vdash \tau_2\ \text{pointed}}
     {\vdash (\tau_1, \tau_2)\ \text{pointed}}
\]

Figure 6: Rules for pointed types

Figure 6 gives rules for determining when a type is pointed.
Unfortunately, the extension to a polymorphic type system
is problematic: is the type α pointed or not? There are three
possible choices:

•  We  could  decide  that  type  variables  can  only  range 
over  pointed  tjrpes.  This  is  precisely  the  restriction 
proposed  by  Peyton  Jones  &  Laxmchbury  [1991],  but 
it  is  unacceptable  in  our  IL  because  we  expect  (the 
translations  of)  most  ML  data  types  to  be  unpointed. 
For  example,  an  ordinarj^  non-recursive  polymorphic 
function  such  as  the  identity  function  could  not  be 
applied  to  both  3  and  retLiit  3:  because  one  has  a 
lifted  type  and  one  does  not. 

^There is a substantial literature on the categorical treatment of
recursion (for example, Pitts [1996]), but the discussion of this section
focuses on the specific setting of CPO.
• We could allow type variables to range over all types,
but  prohibit  recursion  at  a  type  variable.  This  would 
irritatingly  reject  recursive  functions  whose  result  type 
is  a  type  variable,  such  as  the  function  nth  that  selects 
the  n’th  element  from  a  list. 

nth : ∀α. Int -> (List α) -> α

• Alternatively, we could employ qualified universal
quantification, where type variables at which fixpoints
are taken are explicitly qualified:

nth : ∀α ∈ Pointed. Int -> (List α) -> α

Launchbury & Paterson [1996] elaborate on this idea.

Since the first two choices are untenable, we conclude that
adding polymorphism to a language with both recursion and
unpointed types requires the use of qualified universal
quantification.


3.6 Controlling evaluation in L1

While L1 seems to be quite suitable from a theoretical point
of view, it suffers from a serious practical drawback: L1 is
vague about the timing and degree of evaluation. Consider
the L1 expression:

let x : τ = e in f x

What  code  should  the  code  generator  produce  for  such  an 
expression? 

• An ML compiler writer would probably expect the
code to evaluate the right-hand side of the let, and
then call f passing the value thus computed. But this
eager strategy is incorrect in general if e diverges, and
f does not evaluate its argument, as a quick glance at
Figure 2 will confirm.

•  A  safe  strategy  is  to  build  a  thunk  (suspension)  for 
the  right-hand  side,  bind  x  to  this  thunk,  and  call  f 
passing  the  thunk  to  it.  That  is  precisely  what  the 
code  generator  for  a  lazy  language  would  do. 
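
The difference between the two strategies can be observed in a small Haskell sketch. It is only an analogy under stated assumptions: Haskell's `Int` is lifted, so it plays the role of Lift Int rather than L1's Int; `error` stands in for divergence so that the outcome is observable; and the names `lazyLet`, `eagerLet` and `terminates` are ours.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- A function that never inspects its argument.
f :: Int -> Int
f _ = 0

-- Stand-in for a diverging right-hand side (an assumption of this sketch:
-- a real divergence would loop rather than raise an error).
e :: Int
e = error "diverges"

-- The safe strategy: x is bound to a suspension that f never forces.
lazyLet :: Int
lazyLet = let x = e in f x

-- The eager strategy: the right-hand side is evaluated before the call.
eagerLet :: Int
eagerLet = let x = e in x `seq` f x

-- Reports whether evaluating the argument yields a value.
terminates :: Int -> IO Bool
terminates v = do
  r <- try (evaluate v) :: IO (Either ErrorCall Int)
  pure (either (const False) (const True) r)
```

Here `terminates lazyLet` yields `True`, whereas `terminates eagerLet` yields `False`: the eager strategy changes the meaning of the program whenever the right-hand side diverges and f ignores its argument.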

Now suppose that we are compiling code for f, and that
f has type Int -> Int. The major motivation for
distinguishing Int from Lift Int was to allow the compiler to
treat values of type Int as certainly-evaluated, just as a
strict-language compiler would assume (Section 3.1). It is
unacceptable for f to test whether its argument is evaluated;
such a choice would guarantee that no ML compiler would
use this intermediate language! Alas, the safe strategy for
preparing f's argument does indeed pass an unevaluated
thunk, so f must be prepared for this eventuality.

Can  we  instead  use  a  hybrid  strategy? 

• A hybrid strategy for compiling let expressions might
use the type of the bound variable to decide what to
do: for types whose values are sure to converge (such
as Int) it can evaluate the right-hand side eagerly;
otherwise it can build a thunk. This strategy works for
a simply-typed language but fails (again!) when we
introduce polymorphism. What is the code generator
to do with a let that binds a value of type α? Either
the instantiating type must be passed as an argument,
or we must have two versions of the code, one for
terminating types and one for possibly-diverging ones.


We regard these complications as a very serious (and far
from obvious) objection to using L1 for operational
purposes.

3.7  Summary 

We expected it to be a routine matter to translate both
Haskell and ML into a common language built directly on
top of the standard mathematics for programming-language
semantics. To our surprise it was not, as Sections 3.5-3.6
describe.

L1 may still be quite useful as a kernel language for
reasoning about programs. However, as Section 3.6 has shown,
it is unsuitable as a compiler intermediate language. Thus
motivated, we now turn our attention to a second design
that is more suitable as an IL.


4 L2, a language of partial functions

Our second design starts from the problem we described in
Section 3.6. Operationally, it is essential to be able to
control exactly when evaluation takes place, so that the
recipient of a value knows for sure whether or not it is evaluated.

Since we want to control what evaluation is done when, the
obvious thing to do is to make let (and, of course, function
application) eager. That is, to evaluate

let x = e in b

one evaluates e, binds it to x, and then evaluates b. (We
use the operational term "eager", rather than the semantic
term "strict", because the latter does not mean anything if
the type of e has no bottom element.) How, then, are we to
translate the lets and function applications of a lazy
language? There is a standard way to do so, namely by making
the construction and forcing of thunks explicit (Friedman &
Wise [1976]). This is what we do in L2.

Figure 7 gives the syntax and extra type rules for L2. There
is now only one monad, ST; the Lift monad is now implicit
in the semantics of L2 so that let and function application
can be eager. There is a new syntactic form, <e>, that
suspends the evaluation of e, and a new constant, force, that
forces the suspension returned by its argument. There is
one new type, <ρ>, which is the type of <e> if e has type ρ.
The two new type rules, (DELAY) and (FORCE), are just
as you would expect.

Another new feature is that types are divided into value
types, τ, and computation types, ρ. Intuitively, an expression
has a computation type, while a variable is always bound to
a value type. Another way to say this is that the typing
judgement now has the form

\[ \{x_1 : \tau_1, \ldots, x_n : \tau_n\} \vdash e : \rho \]

The type rules of Figure 1 apply unchanged, because we
carefully used τ and ρ in the right places, although they were
synonymous in L1. Function arguments and the right-hand
sides of let(rec) expressions all have value types, and are
evaluated eagerly. This separation of value types from
computation types neatly finesses the awkward question of what
it means to "evaluate" an argument computation without
also "performing" it, which caused us some heart-searching
in earlier un-stratified versions of L2. For example, the
expression (f (read r)) is ill-typed, and hence we do not
have to evaluate (read r) without also performing its state
changes. Indeed, expressions of type ST τ can only occur as


Figure 7: Extra syntax and type rules for L2


Figure 9: Translations of "ML" and "Haskell" into L2


the right-hand side of a letST, the body of a function, or
as the value of the whole program. Finally, when
polymorphism is introduced, type variables range over value types
only.

Figure 8 gives the semantics of L2 in full. The crucial point
is that L2's function type arrow is now interpreted as the
CPO of partial functions, denoted ⇀, and the semantic
evaluation function E takes an expression to a partial
function from environments to values. Many of the equations
are defined conditionally. For example, the equation for
E⟦e1 e2⟧ρ says that if both E⟦e1⟧ρ and E⟦e2⟧ρ are defined
then the result is just the application of those two values;
otherwise there is no equation that applies for E⟦e1 e2⟧ρ, so
it too is undefined.

The <_> type constructor is modeled using lifting; the
semantics of force and <_> move to and fro between lifted
CPOs and partial functions. It may seem odd that we use
two different notations (Lift τ in L1 and <τ> in L2) with
the same underlying semantic model, namely lifting. The
reason is that in L1 we use lifting as a monad (with a bind
operation, for example), whereas in L2 we use it to model
thunks (with a force operation but no bind).
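
The conditional equations and the role of lifting can be sketched as a small evaluator. This is an illustrative model under stated assumptions, not the semantics of Figure 8: the syntax and constructor names are hypothetical, `Maybe` stands for partiality (`Nothing` means "no equation applies"), and `Diverge` is an invented stand-in for an expression with no defined value.

```haskell
-- Hypothetical abstract syntax for a fragment of L2.
data Expr
  = Var String
  | Lit Int
  | Lam String Expr
  | App Expr Expr
  | Let String Expr Expr  -- eager let
  | Delay Expr            -- the thunk former <e>
  | Force Expr            -- force e
  | Diverge               -- stand-in for an undefined expression

data Value
  = IntV Int
  | FunV (Value -> Maybe Value)  -- a partial function
  | ThunkV (Maybe Value)         -- a lifted value; Nothing plays bottom

type Env = [(String, Value)]

-- A partial function from environments to values.
eval :: Env -> Expr -> Maybe Value
eval env (Var x)     = lookup x env
eval _   (Lit n)     = Just (IntV n)
eval env (Lam x b)   = Just (FunV (\v -> eval ((x, v) : env) b))
eval env (App e1 e2) = do
  -- Defined only if both sub-expressions are defined.
  g <- eval env e1
  v <- eval env e2
  case g of
    FunV h -> h v
    _      -> Nothing  -- ill-typed terms are simply left undefined here
eval env (Let x e b) = do
  v <- eval env e      -- eager: an undefined right-hand side is fatal
  eval ((x, v) : env) b
eval env (Delay e)   = Just (ThunkV (eval env e))  -- always defined
eval env (Force e)   = do
  t <- eval env e
  case t of
    ThunkV m -> m      -- from the lifted CPO back to partiality
    _        -> Nothing
eval _   Diverge     = Nothing

-- Extracts an integer result, if evaluation is defined and yields one.
asInt :: Maybe Value -> Maybe Int
asInt (Just (IntV n)) = Just n
asInt _               = Nothing
```

In this model `Let "x" Diverge (Lit 1)` is undefined, while `Let "x" (Delay Diverge) (Lit 1)` evaluates to 1, because forming a thunk never forces its contents. Note that `Delay` and `Force` are the only places where `ThunkV` is built or taken apart, and there is no bind operation on it.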

The entire semantics of L2 could instead be presented in the
CPO of total functions, using the isomorphism:

\[ \sigma \rightharpoonup \tau \;\cong\; \sigma \to \tau_\bot \]

Which to choose is just a matter of taste. What we like
about our presentation is that each L2 type constructor
corresponds directly to a single categorical type constructor,
whereas in the alternative presentation the L2 function type
gets a more "encoded" translation. Launchbury & Baraki
[1996] use partial functions in essentially the same way.

The translation of "ML" into L2 is exactly the same as the
translation into L1. The translation of "Haskell" is different,
however, because we now have to be explicit about the
introduction of thunks (Figure 9). Concerning types,
notice the use of the type constructor <_> on the arguments
of functions and data constructors. Concerning terms, the
thunk-former <_> is used for function arguments and the
right-hand side of all let and letrec definitions. Thunks
are evaluated explicitly, using force, when returning a
variable or the result of fst or snd.
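
The term-level part of this translation can be sketched for a tiny lazy source language. This is an illustration based only on the prose description above; it does not reproduce Figure 9, it omits types, pairs and letrec, and all constructor names are hypothetical.

```haskell
-- Hypothetical lazy source terms.
data Src
  = SVar String
  | SLit Int
  | SLam String Src
  | SApp Src Src
  | SLet String Src Src
  deriving (Eq, Show)

-- Hypothetical L2-like target terms with explicit thunks.
data Tgt
  = TVar String
  | TLit Int
  | TLam String Tgt
  | TApp Tgt Tgt
  | TLet String Tgt Tgt
  | TDelay Tgt   -- <e>
  | TForce Tgt   -- force e
  deriving (Eq, Show)

translate :: Src -> Tgt
translate (SVar x)     = TForce (TVar x)            -- variables denote thunks
translate (SLit n)     = TLit n
translate (SLam x b)   = TLam x (translate b)
translate (SApp g a)   = TApp (translate g) (TDelay (translate a))
translate (SLet x e b) = TLet x (TDelay (translate e)) (translate b)
```

For instance, the source application `f x` becomes `(force f) <force x>`: the argument is suspended, and each variable occurrence is forced where its value is needed.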

4.1 Controlling evaluation in L2

The main benefit of using L2 is that its semantics permit
an eager interpretation of vanilla let; namely, "evaluate the
right-hand side, bind the value to the variable, and then
evaluate the body". A consequence is that any variable of
type other than <τ>, or a type variable (which might be
instantiated to <τ>), is sure to be fully evaluated, just as in
any ML implementation.


4.2 Recursion in L2

Another advantage of L2 is that we can solve our earlier
difficulties with recursion (Section 3.5) without requiring
bounded quantification.

Firstly, we more or less have to restrict letrecs to bind
only syntactic values, because we cannot eagerly evaluate
the right-hand side. (Why not? Because we cannot
construct the environment in which to evaluate it.) That in
turn means that the meaning of the right-hand side is
always defined, which is why there is no side condition in the
semantics of letrec.

But Figure 7 further restricts the right-hand side of a letrec
to be a particular sort of syntactic value, a pointed value, or
Figure 3: Semantics of L1


PValue. The syntactic category of PValues is chosen so
that it can only denote a value from a pointed domain, and
hence a letrec definition always has a least fixpoint. To see
this, consider the forms that a PValue can take:

•  A  lambda  abstraction  denotes  a  partial  function,  and 
the  CPO  of  partial  functions  is  always  pointed:  its  least 
element  is  the  everywhere  undefined  function. 

• A thunk <e>, where e : τ, is drawn from a pointed
CPO, since <_> is modeled by lifting.

Fortunately, the syntactic restriction of letrec does not lose
any useful expressiveness. ML insists that letrecs bind only
functions (which are PValues), while Haskell binds thunks
(which are also PValues). So there is no difficulty with
translating the recursion arising in both ML and Haskell
into L2.


4.3  Why  not  have  just  one  monad? 

Now that we have eliminated the Lift monad, and made
vanilla let eager, there is another question we should ask:
why not give vanilla let the semantics of letST, and
eliminate the latter altogether? To put it another way, we have
made eager evaluation implicit in the semantics of let; why
not add implicit side effects as well? After all, the code
generated for letST x <- e in b will be something like "the code
for e followed by the code for b", and that is just the same
as the code we now expect to generate for let x = e in b.

However, if we have just one form of let we lose valuable
optimising transformations. In particular, the sequence of
computations in ST must be maintained, whereas let
bindings can be re-ordered freely. Changing the order of
evaluation is fundamental to several useful transformations,
including common sub-expression, loop invariant
computations, all kinds of code motion (Peyton Jones, Partain &
Santos [1996]), inlining, and strictness analysis (remember
we may be compiling a lazy language into L1). To take a
simple example, the following transformation is not in
general valid for letST, but is valid for vanilla let (assuming
there are no name clashes):

let x1 = e1 in let x2 = e2 in b

let x2 = e2 in let x1 = e1 in b

Of course, one could do an effects analysis to determine
which sub-expressions were pure, as good ML compilers do,
... but that is effectively just what the monadic type system
records!
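
The contrast between re-orderable bindings and sequenced computations can be observed directly in Haskell, whose `ST` monad (from `Control.Monad.ST` and `Data.STRef`) is used here as an analogue of the paper's ST; the function names and the particular constants are ours, chosen only for illustration.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)

-- Pure bindings: swapping the two lets cannot change the result.
pureAB, pureBA :: Int
pureAB = let x1 = 2 + 3 in let x2 = 4 * 5 in x1 + x2
pureBA = let x2 = 4 * 5 in let x1 = 2 + 3 in x1 + x2

-- Stateful bindings: x1 reads the cell, x2 comes from a write to it.
readThenWrite :: Int
readThenWrite = runST (do
  r  <- newSTRef 0
  x1 <- readSTRef r                 -- reads the initial 0
  x2 <- writeSTRef r 1 >> pure 10
  pure (x1 + x2))

-- The same two bindings in the opposite order.
writeThenRead :: Int
writeThenRead = runST (do
  r  <- newSTRef 0
  x2 <- writeSTRef r 1 >> pure 10
  x1 <- readSTRef r                 -- now reads the updated 1
  pure (x1 + x2))
```

The two pure versions agree, but `readThenWrite` is 10 while `writeThenRead` is 11, so the swap is not meaning-preserving for the monadic form.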


5  Assessment 
5.1 L1 vs L2

What have we lost in the transition from L1 to L2, apart
from a somewhat more complicated semantics? One loss is
L1's ability to describe types whose values are sure to
terminate. If an L1 function has type Int->Int then a call to the
function cannot diverge; but the same is not true of L2. This
does not have much impact on a compiler, but it makes
programmer reasoning about L2 programs more complicated.
Another important difference is that L2 has a weaker β rule.
L1 has full β-conversion. That is, for any expressions e and
b,

let x = e in b  =  b[e/x]

(A similar rule holds for application, of course.) In L2,
however, β does not hold in general. A particular case of this is
that if x is not mentioned in b then in L1 the binding can
be discarded; in L2 the binding can only be discarded if the
right-hand side is a value.

However, βv, a restricted form of β that allows
only values to be substituted, is valid in L2. Values are
defined in Figure 7, and include variables, constants, and
lambda abstractions, as usual. However, values also include
thunks. Hence any Haskell β reduction has a corresponding
βv reduction in its L2 translation. Thus, the restriction to
βv will not prevent a Haskell compiler from doing anything
it can do in an implicitly lazy language with a full β.

Thus far we have assumed a call-by-name semantics, in
which we are content to duplicate arbitrary amounts of work
provided we do not change the overall result. In practice no
compiler would be so liberal; we desire a call-by-need
semantics in which work is not duplicated. As Ariola et al.
[1995] describe, we can give a call-by-need semantics to
L1 by weakening β to βv and adding a garbage-collection
rule that allows an unused let binding to be discarded. An
analogous result holds in L2: we can obtain call-by-need
semantics by replacing <e> by <v> in the definition of values
in Figure 7.


5.2 L2 vs Haskell and ML ILs

Our main theme is the search for an IL that can serve for
both ML- and Haskell-like languages. However, we believe
that a language like L2 is attractive in its own right to either
community in isolation, because one might get better code
from an L2-based compiler.

For the Haskell compiler writer L2 offers the ability to
express in its type that a value is certainly evaluated. This
gives a nice way to express the results of strictness analysis:
a function argument of unpointed type must be passed by
value. Flat arrays and strict data structures also become
expressible.

For the ML compiler writer L2 offers the ability to express
the fact that a computation is free from side effects, which
is a precondition for a raft of useful transformations
(Section 4.3). While this information can be gleaned from an
effects analysis, maintaining this information for every
sub-expression, across substantial program transformations, is
not easy. In L2, however, local transformations can
perform, and record the results of, a simple incremental effects
analysis. For example, consider the following ML function:

fun f x = fst (fst x)

If we translate this into L2 we obtain:

f = retST (λx. letST a2 <- letST a1 <- retST x in
                           retST (fst a1)
               in
               retST (fst a2))

Simple application of the rules of Figure 3 allows this
expression to simplify to:

f = retST (λx. let a1 = x in
               let a2 = fst a1 in
               retST (fst a2))

Now the retST can be floated outwards, to give:

f = retST (λx. retST (let a1 = x in
                      let a2 = fst a1 in
                      fst a2))

In this form, the inner retST makes it apparent that f has
no side effects. We have, in effect, performed a sort of
incremental effects analysis. The same idea can be taken further.
If f is inlined at its call sites, then the retST will cancel
with the letST there, and so on. Even if f's body is big, we
can use the "worker-wrapper" technique of Peyton Jones &
Launchbury [1991] to split f into a small, inlinable wrapper
and a large, non-inlinable worker, fw, thus:

f  = retST (λx. retST (fw x))
fw = λx. (...body of f...)

Blume & Appel [1997] describe a similar technique that they
call "lambda-splitting".

The point of all this is that there is a real payoff for an
ML compiler from making the ST monad explicit. Easy,
incremental transformations perform a local effects analysis;
at each stage the state of the analysis is recorded in the
program itself, rather than in some ad hoc auxiliary data
structures; and all other program transformations will
automatically preserve (or exploit) the analysis.


5.3  Parametricity 

Polymorphic functions have certain parametricity properties
that may be derived purely from their types (Mitchell &
Meyer [1985]; Reynolds [1983]; Wadler [1989]). For example,
in the pure polymorphic lambda calculus, a function f with
type ∀α. α → α → α satisfies the theorem

\[ \forall A, B.\ \forall h : A \to B.\ \forall x, y : A.\ h\,(f\,x\,y) = f\,(h\,x)\,(h\,y) \]

In fact, f satisfies something even stronger in which the
function h can be an arbitrary relation between A and B.

When we add "polymorphic" constants to the pure calculus,
the effect is that the choice of functions h becomes restricted.
For example, adding a fixpoint operator fix : ∀α. (α → α) → α
forces the restriction that the h functions be strict (map ⊥
to ⊥) and inductive (i.e. continuous). This is the situation
in Haskell, for example.

Adding polymorphic sequencing, say through an operator
seq : ∀α, β. α → β → β or by building it into the
semantics of function application, forces the restriction that the
h functions be bottom-reflecting (i.e. defined on all defined
arguments). This is the basic situation in pure ML.

Adding polymorphic equality forces the h functions to be
at least one-to-one; and adding polymorphic state
operations like !r seems to remove any last shreds of interesting
parametricity.

What, then, are the parametricity properties of L1 and L2?
If parametricity properties are weakened by claiming various
primitives to be more polymorphic than they really are, then
by being more cautious in the types we assign them, we may
hope to restrengthen parametricity.
In L2, for example, recursion is only done either at a
function type, or at a suspension type. Recursion is never
permitted at a fully polymorphic type (unlike in Haskell). This
has the effect of allowing the strictness side condition to
be dropped, though inductiveness (or continuity) is still
required. The same is achieved in L1 through the use of the
pointed restriction (see Launchbury & Paterson [1996] for a
comparable situation). Furthermore, since all state
operations are explicitly typed within the state monad, they also
do not interfere with parametricity in a negative way.

The main difference between L1 and L2 is to do with forcing
evaluation. L1 has no polymorphic forcing operation, so has
no consequent weakening of its parametricity property. L2
does, however: it is built into its eager function
application. Thus for L2 the parametricity theorem demands the
h functions to be everywhere defined.

To see an example of this, consider the function
K : ∀α, β. α → β → α, which selects its first argument,
discarding its second. The parametricity theorem is

\[ \forall h_1 : A \to A',\ h_2 : B \to B'.\ \forall x : A,\ y : B.\ h_1\,(K\,x\,y) = K\,(h_1\,x)\,(h_2\,y) \]

Clearly this holds only if h2 is total (defined everywhere),
otherwise the right hand side may not be defined when the
left hand side is.
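
The failure of the theorem for a partial h2 can be reproduced in a short Haskell sketch. It rests on assumptions of ours: Haskell application is lazy, so eager application is simulated with `seq`; `error` stands in for undefinedness so that it can be observed; and the particular h1 and h2 are arbitrary examples.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- K under lazy application: the second argument is never touched.
kLazy :: a -> b -> a
kLazy x _ = x

-- K under eager application, simulated by forcing the second argument.
kEager :: a -> b -> a
kEager x y = y `seq` x

h1 :: Int -> Int
h1 = (+ 1)

-- A partial h2: undefined at 0 (error stands in for divergence).
h2 :: Int -> Int
h2 0 = error "h2 is undefined at 0"
h2 n = n

-- Reports whether evaluating the argument yields a value.
defined :: Int -> IO Bool
defined v = do
  r <- try (evaluate v) :: IO (Either ErrorCall Int)
  pure (either (const False) (const True) r)
```

With x = 5 and y = 0, the left-hand side `h1 (kEager 5 0)` is defined, but the right-hand side `kEager (h1 5) (h2 0)` is not; with `kLazy` both sides are defined and equal.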

There is a practical implication to this. A class of techniques
for removing intermediate lists called foldr-build relies on
parametricity for its correctness (Gill, Launchbury & Peyton
Jones [1993]). While a strictness side condition is not
damaging, a totality condition is too restrictive. The
technique can no longer rely on the types to provide sufficient
guidance for correctness. This is disappointing, although
unsurprising. The compiler can still recover the short-cut
deforestation technique by refining L2's type system to use
qualified types along the lines of Launchbury & Paterson
[1996].

5.4 Side effects and polymorphism

It is well known that the ability to create polymorphic
references can lead to unsoundness in the type system (Tofte
[1990]). For example, if we are able to create a reference
r with type ∀α. Ref α then we would be able to write the
following erroneous code:

letST () <- wr (r Int) 2 in
letST f : (Int->Int) <- rd (r (Int->Int)) in
retST (f 3)

However, in both L1 and L2 any expression of type ∀α. Ref α
is undefined in any environment! The only way to construct
a value of Ref type is with new, which returns a value of type
ST (Ref τ). The only way to strip off the ST constructor is
with letST. Looking at the typing rule for letST, we can
see that the bound variable must have type Ref τ.

SML's so-called "value restriction" conservatively restricts
generalisation in let bindings precisely to avoid the
construction of such polymorphic references. We conjecture
(though we have not proved) that L1 and L2 are both sound
without any such side conditions.


5.5  ML  thunks 

One of the advantages of a language that supports both
strict and lazy evaluation is that it can accommodate source
languages that have such a mixture. Indeed, it is quite
straightforward to map Haskell's strictness annotations
(Peterson et al. [1997]) onto L2. Coming from the other
direction, it has long been known that thunks can be encoded
explicitly in a strict, imperative language. For the sake of
concreteness we use the notation proposed for ML in Okasaki
[1996]. In this proposal delayed ML expressions are prefixed
by a $, thus:

let val z = $(f y) in b end

Here, assuming (f y) has type int, z is bound to a thunk
of type int susp that, when forced, evaluates (f y) and
overwrites the thunk with its value.

We expected that these "ML thunks" would map directly
onto L2's thunks, but that turned out not to be the case.
The semantics of ML thunks is considerably more
complicated than that of L2's thunks, because of the interaction
with state. Consider the following ML expression:

let val rec z = $(let val y = !r - 1 in
                    r := y;
                    if y = 0 then 0
                    else force z + force z
                  end)
in ... end

(This defines z recursively, which is not possible in ML, but
essentially the same thing can be done using another
reference to "tie the knot". We use the recursive form to reduce
clutter.) When z is evaluated it decrements the contents of
the reference cell r; but then, if the new value is non-zero,
z evaluates itself! In effect, there can be multiple
simultaneous activations of z, rather like the multiple activations
of a recursive function. (Indeed, a non-memoising
implementation of ML thunks can be obtained by representing $e
by λ().e.) Furthermore, these multiple activations can each
have a different value, because they each read the state.

L2's thunks have a much simpler semantics. A thunk has
only one value, and there can be at most one activation
of the thunk'. The key insight is that evaluation of an L2
thunk has no side effects, unlike the ML thunk above. But
what if the contents of the thunk performs side effects? For
example:

let x = <letST v : Int <- rd r in wr r (v+1)> in e

Here, if r : Ref Int, then x has type <ST ()>, not <()>.
Forcing the thunk (with force) causes no side effects (apart
from updating the thunk itself), and yields a computation
that, when subsequently performed (by a letST), will
increment the location r. The computation x may be performed
many times; for example, e might be

letST a1 : () <- force x in letST a2 : () <- force x in ...

What this means, though, is that the more complicated
semantics of ML thunks have to be expressed explicitly in L2,
presumably by coding them up using explicit references.
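
One way such a coding might look is sketched here in Haskell, using `IORef` from `Data.IORef` in place of ML or L2 references. All names (`Susp`, `delay`, `force`, `nestedActivations`) are ours, and the second reference `knot` plays the knot-tying role mentioned earlier; the sketch illustrates the re-entrant behaviour of memoising thunks and is not a proposed L2 encoding.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)

-- A memoising suspension: a pending computation or its cached value.
data SuspState a = Pending (IO a) | Done a
newtype Susp a = Susp (IORef (SuspState a))

delay :: IO a -> IO (Susp a)
delay act = Susp <$> newIORef (Pending act)

-- Runs the computation if still pending, then overwrites the cell.
force :: Susp a -> IO a
force (Susp ref) = do
  st <- readIORef ref
  case st of
    Done v      -> pure v
    Pending act -> do
      v <- act
      writeIORef ref (Done v)
      pure v

-- The self-forcing thunk from the text, with r initialised to n (n >= 1).
-- Returns the thunk's value and how often its body was entered.
nestedActivations :: Int -> IO (Int, Int)
nestedActivations n = do
  r     <- newIORef n
  count <- newIORef (0 :: Int)
  knot  <- newIORef (error "knot not yet tied")
  z <- delay $ do
    modifyIORef count (+ 1)
    y <- subtract 1 <$> readIORef r
    writeIORef r y
    if y == 0
      then pure 0
      else do
        self <- readIORef knot
        a <- force self  -- re-enters the body: z is still Pending
        b <- force self  -- finds the value cached by the inner activation
        pure (a + b)
  writeIORef knot z
  v <- force z
  c <- readIORef count
  pure (v, c)
```

With n = 3 the body of z is entered three times, each activation nested inside the previous one, before any of them has stored a value.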

‘ More precisely, if there is more than one then the thunk's value
depends on its own value, so its value is undefined. This property
justifies the well-known technique of "black-holing" a thunk, both
to avoid space leaks and to report certain non-termination (Jones
[1992]).
6 Related work

The FLINT language has rather similar objectives to the
work described here, in that it aims to serve as a common
infrastructure for a variety of higher-order typed source
languages (Shao [1997b]). However, FLINT has not (so far)
concentrated much on the issue of strictness and laziness,
which is the main focus of this paper. The ideas described
here could readily be incorporated in FLINT.

Both the Glasgow Haskell Compiler and the TIL ML
compiler use a polymorphic strongly-typed internal language,
though the latter is considerably more sophisticated and
complex (Peyton Jones [1996]; Tarditi et al. [1996]).
Neither, however, seriously attempts to compile the other's main
evaluation-order paradigm.


7  Further  work 

In this paper we have concentrated on a core calculus. Some
work remains to extend it to a practical IL:

• Recursive data types and case expressions must be
added; we anticipate no difficulty here.

• A proof of type soundness is needed. As we note in
Section 5.4, its soundness is not obvious.

• We have a simple operational semantics for L2; we are
confident that it is sound and adequate, but have yet
to do the proofs.

• We are studying whether it is possible to combine L1's
ability to describe certainly-terminating computations
with L2's operational model.

Accommodating the ML module system is likely to involve
a significant extension of the type system (Harper & Stone
[1997]); we have not yet studied such extensions.

In a separate paper we discuss how to use the framework of
Pure Type Systems to allow the language of terms, types,
and kinds to be merged into a single language and compiler
data type (Peyton Jones & Meijer [1997]). We hope to merge
the results of that paper and this one into a single IL.

We  have  made  no  attempt  to  address  the  tricky  problem 
of  how  to  combine  monads.  For  example,  ML  includes  the 
monad  of  state  and  exceptions.  Is  it  advantageous  to  sepa¬ 
rate  them  into  the  composition  of  two  monads,  or  is  it  better 
to  have  a  single,  combined  monad?  In  the  former  case,  what 
transformations  hold? 

An important operational question is that of the
representation of values, especially numbers. Quite a few papers
have discussed how to use unboxed representations for data
values, and it would be interesting to translate their work
into the framework of L2 (Leroy [1992]; Peyton Jones &
Launchbury [1991]; Shao [1997a]).
Abstract

In writing this paper we had two goals. First, to promote MetaML, a
programming language for writing staged programs, and second, to demonstrate that staging
a program can have significant benefits. We do this by example: the derivation of
an executable compiler for a small language. We derive the compiler in a rigorous
fashion from a semantic description of the language. This is done by staging a
denotational semantics, expressed as a monadic interpreter. The compiler is a program
generator, taking a program in the source language (a while-program) as input and
producing an ML program as target. The ML program produced is in a restricted
subset of ML over which the programmer has complete control. It is encapsulated in
a special data-structure called code. The meta-programming capabilities of MetaML
allow this data-structure to be directly executed (run-time code generation), or to be
analysed. We illustrate this analysis of generated code to build a source to source
transformation which applies the monad laws to significantly improve the generated
code.


1  Compilers  as  staged  interpreters 

Interpreters,  when  implemented  in  high-level  declarative  languages,  are  very  close  to  the 
interpreted  language’s  denotational  semantics.  Because  of  this,  interpreters  are  usually 
used  for  the  development  of  prototypes,  but  such  prototypes  lack  both  efficiency  and  any 
connection to the underlying system in which the compiled code must run. If expressed
in a monadic style, an interpreter can be mapped closer to the underlying system, and
the  structuring  properties  of  the  monad  even  allow  the  interpreter  to  be  reused  as  the 
system  evolves  [32,  14,  27].  Nevertheless,  the  effort  used  to  build  the  interpreter  is  often 
considered  wasteful  since  the  programmer  still  needs  to  re-implement  the  compiler  from 
scratch  after  building  the  interpreter. 

Our  solution  to  this  problem  is  the  following  multi-step  method.  First,  construct  the 
denotational  semantics  as  an  interpreter  in  a  functional  language.  Second,  capture  the 
effects  of  the  language,  and  the  environment  in  which  the  target  language  must  run,  in 
a  monad.  Then  rewrite  the  interpreter  in  a  monadic  style.  Third,  stage  the  interpreter 
using  meta-programming  techniques.  This  staging  is  similar  to  the  staging  of  interpreters 
using  a  partial  evaluator,  but  is  explicit  rather  than  implicit,  since  the  programmer  places 
the  annotations  directly,  rather  than  using  an  automatic  binding  time  analysis  to  discover 
where  they  should  be  placed.  This  leaves  programmers  in  complete  control,  and  they  can 
limit what appears in the residual program. Fourth, the resulting program is both a data-
structure and a program, so it can be both directly executed and analysed. This analysis
can include either source to source transformations or translation into another form (e.g.
intermediate code or assembly language). Because the programmer has complete control
over the structure of the residual program this can be a trivial task.

Staging of interpreters using partial evaluation has been done before [2, 4]. The
contribution of this paper is to show that this can all be done in a single program. A system
incorporating staging as a first class feature of a language is a powerful tool. When
such a tool is used to write a compiler, the source language can be given semantics,
staged, translated, and optimized, all in a single paradigm. It requires neither additional
processes nor tools, and is under the complete control of the programmer, all the while
maintaining a direct link between the semantics of the interpreter and those of the compiler.
Staging organizes the task of constructing a compiler into simple, incremental steps, where
the semantic connection is maintained through each stage of the derivation. Each step
is a relatively easy task compared to building a compiler from scratch. Constructing a
compiler using a staged language has the following benefits:

•  Simplicity.  Each  task  is  a  simple  one,  and  builds  incrementally  on  the  previous 
tasks. 

• Correctness. The compiler remains connected to its semantics. Each artifact produced
by a task is provably correct with respect to the artifacts of the previous tasks.
The final artifact is both a compiler for the language and a semantics equivalent to
the original semantics.

•  Reuse.  Each  artifact  reuses  the  code  of  the  previous  artifact. 

• Control. The programmer has complete control over the resulting output. He
develops his program with staging in mind, and completely controls the structure
of the residual program.


2  Staging  in  MetaML 

MetaML is almost a conservative extension of Standard ML. Its extensions include four
staging annotations. To delay an expression until the next stage one places it between
meta-brackets. Thus the expression <23> (pronounced "bracket 23") has type <int>
(pronounced "code of int"). We illustrate the important features of the staging annotations
in the short MetaML session below.

-| val z = 3+4;
val z = 7 : int

-| val quad = (3+4, <3+4>, lift (3+4), <z>);
val quad = (7, <3 %+ 4>, <7>, <%z>) :
           (int * <int> * <int> * <int>)

-| fun inc x = <1 + ~x>;
val inc = Fn : ['a].<int> -> <int>

-| val six = inc <5>;
val six = <1 %+ 5> : <int>

-| run six;
val it = 6 : int

Users access MetaML through a read-type-eval-print top-level. The declaration for z
is read, typed to see that it has a consistent type (int here), evaluated (to 7), and then
both its value and type are printed.

The declaration for quad contrasts normal evaluation with the three ways objects of
type code can be constructed. Placing brackets around an expression (<3+4>) defers the
computation of 3+4 to the next stage, returning a piece of code. Lifting an expression
(lift (3+4)) evaluates that expression (to 7 here) and then lifts the value to a piece of
code that when evaluated returns the same value. Brackets around a free variable (<z>)
create a new constant piece of code with the value of the variable. Such constants print
with a % sign to indicate they are constants. We call this lexical-capture of free variables.
Because in MetaML operators (such as + and *) are also identifiers, free occurrences of
operators often appear with % in front of them.

The declaration of the function inc illustrates that larger pieces of code can be
constructed from smaller ones by using the escape annotation. Bracketed expressions can
be viewed as frozen, i.e. evaluation does not apply under brackets. However, it is often
convenient to allow some reduction steps inside a large frozen expression while it is being
constructed, by "splicing" in a previously constructed piece of code. MetaML allows one
to escape from a frozen expression by prefixing a sub-expression within it with the tilde
(~) character. Escape must only appear inside brackets.

In the declaration for six, the function inc is applied to the piece of code <5>,
constructing the new piece of code <1 %+ 5>.

Running a piece of code strips away the enclosing brackets and evaluates the
expression inside.
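The session above can be imitated in a language without staging by representing code as an ordinary data structure. The following Python sketch is only an analogy under stated assumptions: the classes Lit and Add and the functions lift, inc, and run are illustrative names, not part of MetaML. Constructing a term plays the role of brackets, passing a term as an argument plays the role of escape, and run evaluates the delayed expression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    """Code for an integer constant, e.g. <7>."""
    value: int


@dataclass(frozen=True)
class Add:
    """Code for an addition, e.g. <3 + 4>."""
    left: object
    right: object


def lift(value):
    """Analogue of lift: turn an already computed value into code for it."""
    return Lit(value)


def inc(code):
    """Analogue of  fun inc x = <1 + ~x>: splice `code` into a larger term."""
    return Add(Lit(1), code)


def run(code):
    """Analogue of run: evaluate the delayed expression."""
    if isinstance(code, Lit):
        return code.value
    return run(code.left) + run(code.right)


# (3+4, <3+4>, lift (3+4)): computed now, delayed, computed now then delayed.
quad = (3 + 4, Add(Lit(3), Lit(4)), lift(3 + 4))
six = inc(Lit(5))
print(quad, six, run(six))
```

Unlike MetaML, this model gives no static guarantee that generated code is well typed; it only illustrates which work happens in which stage.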

3  Monads  in  MetaML 

We assume the reader has a working knowledge of monads [30, 33]. We use the unit and
bind formulation of monads [32]. In MetaML a monad is a data structure encapsulating
a type constructor M and the unit and bind functions.

datatype ('M : * -> *) Monad = Mon of
    (['a]. 'a -> 'a 'M) *                          (* unit function *)
    (['a,'b]. 'a 'M -> ('a -> 'b 'M) -> 'b 'M);    (* bind function *)

This definition uses SML's postfix notation for type application, and two non-standard
extensions to ML. First, it declares that the argument ('M : * -> *) of the type
constructor Monad is itself a unary type constructor [8]. We say that 'M has kind * -> *.
Second, it declares that the arguments to the constructor Mon must be polymorphic
functions [21]. The type variables in brackets, e.g. ['a,'b], are universally quantified.
Because of the explicit type annotations in the datatype definitions the effect of these
extensions on the Hindley-Milner type inference system is well known and poses no problems
for the MetaML type inference engine.

In  MetaML,  Monad  is  a  first-class,  although  pre-defined  or  built-in  type.  In  particular, 
there  are  two  syntactic  forms  which  are  aware  of  the  Monad  datatype:  Do  and  Return.  Do 
and  Return  are  MetaML’s  syntactic  interface  to  the  unit  and  bind  of  a  monad.  We  have 
modeled  them  after  the  do-notation  of  Haskell[10,  24].  An  important  difference  is  that 
MetaML’s  Do  and  Return  are  both  parameterized  by  an  expression  of  type  ’M  Monad. 
Users  may  freely  construct  their  own  monads,  though  they  should  be  very  careful  that 
their  instantiation  meets  the  monad  axioms.  Do  and  Return  are  syntactic  sugar  for  the 
following: 

(* Syntactic Sugar                           Derived Form *)

Do (Mon(unit,bind)) { x <- e; f }      =     bind e (fn x => f)

Return (Mon(unit,bind)) e              =     unit e

In addition the syntactic sugar of the Do allows a sequence of x_i <- e_i forms, and
defines this as a nested sequence of Do's. For example:

Do m { x1 <- e1; x2 <- e2 ; x3 <- e3 ; e4 } =

Do m { x1 <- e1; Do m { x2 <- e2 ; Do m { x3 <- e3 ; e4 }}}

The  monad  laws,  expressed  in  MetaML’s  Do  and  Return  notation  are: 

Do { x <- Return e ; z } = z[e/x]

Do { x <- m ; Return x } = m

Do { x <- Do { y <- a ; b } ; c } = Do { y' <- a ; Do { x <- b[y'/y] ; c } }

                                  = Do { y' <- a ; x <- b[y'/y] ; c }
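The three laws can be checked on concrete values for any particular monad. The Python sketch below does so for a deliberately small output monad, in which a computation is a pair of a value and the text it printed. The names unit, bind, and say, and the sample functions f and g, are assumptions made for this illustration; they are not MetaML definitions.

```python
def unit(x):
    """Return x and print nothing."""
    return (x, "")


def bind(m, f):
    """Run m, feed its value to f, and concatenate the printed text."""
    a, out1 = m
    b, out2 = f(a)
    return (b, out1 + out2)


def say(x):
    """A computation that returns x and prints it."""
    return (x, str(x) + " ")


f = lambda x: say(x + 1)
g = lambda y: say(y * 2)
m = say(5)

# Do { x <- Return e ; z }  =  z[e/x]
left_identity = bind(unit(3), f) == f(3)
# Do { x <- m ; Return x }  =  m
right_identity = bind(m, unit) == m
# Do { x <- Do { y <- a ; b } ; c }  =  Do { y <- a ; Do { x <- b ; c } }
associativity = bind(bind(m, f), g) == bind(m, lambda y: bind(f(y), g))

print(left_identity, right_identity, associativity)
```

Checking sample values is of course not a proof; it only makes the shape of each law concrete.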

4  The  three-step  method  for  compiler  development 

In  this  section,  we  illustrate  our  method  by  building  the  front  end  of  a  compiler  for  a  small 
imperative  while-language.  We  proceed  in  three  steps.  First,  we  introduce  the  language 
and its denotational semantics by giving a monadic interpreter as a one stage MetaML
program.  Second,  we  stage  this  interpreter  by  using  a  two  stage  MetaML  program  in 
order  to  produce  a  compiler.  Third,  we  illustrate  the  usefulness  of  the  staging  approach, 
by  defining  a  function  that  takes  the  output  code  of  the  compiler  as  input  and  returns 
an  optimized  version.  This  function  is  simply  a  pattern-matching  based  implementation 
of  the  monadic  identity  and  associativity  laws.  This  makes  a  dramatic  difference  in  the 
quality  of  the  generated  code,  and  is  completely  reusable  because  the  laws  hold  for  any 
monad,  not  just  the  monad  used  in  the  example. 

This illustrates the usefulness of combining the monadic and staged approaches.
Without the monadic structure of the interpreter, the usefulness of the monadic laws would have
to be re-captured in a domain specific manner for every compiler. Without the structure
provided by the staging, the pattern-matching based rewrite system would be impossible
to use, because the compile-time computations would intervene and make recognition of
the patterns impossible. In the staged interpreter, the compile-time code has disappeared
by the time we want to apply the pattern based monadic-law transformer.

4.1 The while-language

In  this  section,  we  introduce  a  simple  while-language  composed  from  the  syntactic  elements: 
expressions  (Exp)  and  commands  (Com).  In  this  simple  language  expressions  are  composed 
of  integer  constants,  variables,  and  operators.  A  simple  algebraic  datatype  to  describe  the 
abstract  syntax  of  expressions  is  given  in  MetaML  below: 


datatype Exp =
    Constant of int            (* 5     *)
  | Variable of string         (* x     *)
  | Minus of (Exp * Exp)       (* x - 5 *)
  | Greater of (Exp * Exp)     (* x > 1 *)
  | Times of (Exp * Exp);      (* x * 4 *)

Commands include assignment, sequencing of commands, a conditional (if command),
while loops, a print command, and a declaration which introduces new statically scoped
variables. A declaration introduces a variable, provides an expression that defines its
initial value, and limits its scope to the enclosing command. A simple algebraic datatype
to describe the abstract syntax of commands is:

datatype Com =
    Assign of (string * Exp)           (* x := 1                       *)
  | Seq of (Com * Com)                 (* { x := 1; y := 2 }           *)
  | Cond of (Exp * Com * Com)          (* if x then x := 1 else y := 1 *)
  | While of (Exp * Com)               (* while x>0 do x := x - 1      *)
  | Declare of (string * Exp * Com)    (* declare x = 1 in x := x - 1  *)
  | Print of Exp;                      (* print x                      *)

A simple while-program in concrete syntax, such as

declare x = 150 in
declare y = 200 in { while x > 0 do { x := x - 1; y := y - 1 }; print y }

is encoded abstractly in these datatypes as follows:

val S1 =
  Declare("x", Constant 150,
    Declare("y", Constant 200,
      Seq(While(Greater(Variable "x", Constant 0),
                Seq(Assign("x", Minus(Variable "x", Constant 1)),
                    Assign("y", Minus(Variable "y", Constant 1)))),
          Print(Variable "y"))));

4.2  The  structure  of  the  solution 

Staging is an important technique for developing efficient programs, but it requires some
forethought. To get the best results one should design algorithms with their staged
solutions in mind.

The meaning of a while-program depends only on the meaning of its component
expressions and commands. In the case of expressions, this meaning is a function from
environments to integers. The environment is a mapping between names (which are
introduced by Declare) and their values.

There  are  several  ways  that  this  mapping  might  be  implemented.  Since  we  intend  to 
stage  the  interpreter,  we  break  this  mapping  into  two  components.  The  first  component,  a 
list  of  names,  will  be  completely  known  at  compile-time.  The  second  component,  a  list  of 
integer  values  that  behaves  like  a  stack,  will  only  be  known  at  the  run-time  of  the  compiled 
program. 

The functions that access this environment distribute their computation into two
stages: first, determining at what location a name appears in the name list, and second,
accessing the correct integer from the stack at this location. In a more complicated
compiler the mapping from names to locations would depend on more than just the
declaration nesting depth, but the principle remains the same. Since every variable's location
can be completely computed at compile-time, it is important that we do so, and that these
locations appear as constants in the next stage.

Splitting  the  environment  into  two  components  is  a  standard  technique  (often  called  a 
binding  time  improvement)  used  by  the  partial  evaluation  community[9].  We  capture  this 
precisely  by  the  following  purely  functional  implementation. 

type location = int;
type index = string list;
type stack = int list;

(* position : string -> index -> location *)
fun position name index =
  let fun pos n (nm::nms) = if name = nm then n else pos (n+1) nms
  in pos 1 index end;

(* fetch : location -> stack -> int *)
fun fetch n (v::vs) = if n = 1 then v else fetch (n-1) vs;

(* put : location -> int -> stack -> stack *)
fun put n x (v::vs) = if n = 1 then x::vs else v::(put (n-1) x vs);
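The split between a compile-time list of names and a run-time stack of values can be tried out directly. The sketch below is a Python rendering of the three functions above, using Python lists in place of ML lists; the sample index and stack are illustrative values chosen here, and an undeclared name raises ValueError rather than a pattern-match failure.

```python
def position(name, index):
    """1-based location of the innermost declaration of `name` in the index."""
    return index.index(name) + 1  # list.index finds the first, i.e. innermost, match


def fetch(n, stack):
    """Value stored at 1-based location n of the stack."""
    return stack[n - 1]


def put(n, x, stack):
    """New stack equal to `stack` except that location n holds x."""
    return stack[:n - 1] + [x] + stack[n:]


# After  declare x = 150 in declare y = 200 in ...  the innermost name comes first.
index = ["y", "x"]      # known at compile time
stack = [200, 150]      # known only at run time

loc = position("x", index)          # computed once, at compile time
updated = put(loc, fetch(loc, stack) - 1, stack)
print(loc, updated)
```

Only position inspects names; fetch and put work purely on locations, which is what allows the name lookup to disappear from the generated code.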

The  meaning  of  Com  is  a  stack  transformer  and  an  output  accumulator.  It  transforms 
one  stack  (with  values  of  variables  in  scope)  into  another  stack  (with  presumably  different 
values  for  the  same  variables)  while  accumulating  the  output  printed  by  the  program. 

To  produce  a  monadic  interpreter  we  could  define  a  monad  which  encapsulates  the 
index,  the  stack,  and  the  output  accumulation.  Because  we  intend  to  stage  the  interpreter 
we  do  not  encapsulate  the  index  in  the  monad.  We  want  the  monad  to  encapsulate  only 
the  dynamic  part  of  the  environment  (the  stack  of  values  where  each  value  is  accessed  by 
its  position  in  the  stack,  and  the  output  accumulation). 

The monad we use is a combination of the monad of state and the monad of output:

datatype 'a M = StOut of (int list -> ('a * int list * string));

fun unStOut (StOut f) = f;

fun unit x = StOut(fn n => (x,n,""));

fun bind e f = StOut(fn n => let val (a,n1,s1) = (unStOut e) n
                                 val (b,n2,s2) = unStOut(f a) n1
                             in (b,n2,s1 ^ s2) end);

val mswo : M Monad = Mon(unit,bind);  (* Monad of state with output *)

The  non-standard  morphisms  must  describe  how  the  stack  is  extended  (or  shrunk) 
when  new  variables  come  into  (or  out  of)  scope;  how  the  value  of  a  particular  variable  is 
read  or  updated;  and  how  the  printed  text  is  accumulated.  Each  can  be  thought  of  as  an 
action  on  the  stack  of  mutable  variables,  or  an  action  on  the  print  stream. 

(* read : location -> int M *)
fun read i = StOut(fn ns => (fetch i ns,ns,""));

(* write : location -> int -> unit M *)
fun write i v = StOut(fn ns => ((), put i v ns, ""));

(* push : int -> unit M *)
fun push x = StOut(fn ns => ((), x :: ns, ""));

(* pop : unit M *)
val pop = StOut(fn (n::ns) => ((), ns, ""));

(* output : int -> unit M *)
fun output n = StOut(fn ns => ((), ns, (toString n)^" "));
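To see how these morphisms thread the stack and the output, the monad can be modelled with plain functions. In the Python sketch below a computation is a function from a stack to a triple (value, stack, output); the StOut wrapper is dropped, None stands in for the unit value (), and the sample computation prog is an assumption made for this illustration.

```python
def unit(x):
    return lambda ns: (x, ns, "")


def bind(e, f):
    def run(ns):
        a, n1, s1 = e(ns)        # run the first computation
        b, n2, s2 = f(a)(n1)     # run the second on the updated stack
        return (b, n2, s1 + s2)  # outputs are concatenated in order
    return run


def read(i):
    return lambda ns: (ns[i - 1], ns, "")


def write(i, v):
    return lambda ns: (None, ns[:i - 1] + [v] + ns[i:], "")


def push(x):
    return lambda ns: (None, [x] + ns, "")


pop = lambda ns: (None, ns[1:], "")


def output(n):
    return lambda ns: (None, ns, str(n) + " ")


# push 10; a <- read 1; write 1 (a - 1); d <- read 1; output d; pop
prog = bind(push(10), lambda _:
       bind(read(1), lambda a:
       bind(write(1, a - 1), lambda _:
       bind(read(1), lambda d:
       bind(output(d), lambda _: pop)))))

print(prog([]))
```

Running prog on the empty stack leaves the stack empty again and yields the output "9 ", since the pushed variable is decremented, printed, and then popped.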

4.3  Step  1:  monadic  interpreter 

Because expressions do not alter the stack, or produce any output, we could give an
evaluation function for expressions which is not monadic, or which uses a simpler monad than
the monad defined above. We choose to use the monad of state with output throughout
our implementation for two reasons: first, for simplicity of presentation, and second, because
if the while-language semantics should evolve, using the same monad everywhere makes
it easy to reuse the monadic evaluation function with few changes.

The only non-standard morphism evident in the eval1 function is read, which
describes how the value of a variable is obtained. The monadic interpreter for expressions
takes an index mapping names to locations and returns a computation producing an
integer.

(* eval1 : Exp -> index -> int M *)
fun eval1 exp index =
case exp of
    Constant n => Return mswo n
  | Variable x => let val loc = position x index
                  in read loc end
  | Minus(x,y) =>
      Do mswo { a <- eval1 x index ;
                b <- eval1 y index ;
                Return mswo (a - b) }
  | Greater(x,y) =>
      Do mswo { a <- eval1 x index ;
                b <- eval1 y index ;
                Return mswo (if a > b then 1 else 0) }
  | Times(x,y) =>
      Do mswo { a <- eval1 x index ;
                b <- eval1 y index ;
                Return mswo (a * b) };

The  interpreter  for  Com  uses  the  non-standard  morphisms  write,  push,  and  pop  to 
transform  the  stack  and  the  morphism  output  to  add  to  the  output  stream. 

(* interpret1 : Com -> index -> unit M *)
fun interpret1 stmt index =
case stmt of
    Assign(name,e) =>
      let val loc = position name index
      in Do mswo { v <- eval1 e index ; write loc v } end
  | Seq(s1,s2) =>
      Do mswo { x <- interpret1 s1 index ;
                y <- interpret1 s2 index ;
                Return mswo () }
  | Cond(e,s1,s2) =>
      Do mswo { x <- eval1 e index ;
                if x=1
                then interpret1 s1 index
                else interpret1 s2 index }
  | While(e,body) =>
      let fun loop () =
            Do mswo { v <- eval1 e index ;
                      if v=0 then Return mswo ()
                      else Do mswo { interpret1 body index ;
                                     loop () } }
      in loop () end
  | Declare(nm,e,stmt) =>
      Do mswo { v <- eval1 e index ;
                push v ;
                interpret1 stmt (nm::index) ;
                pop }
  | Print e =>
      Do mswo { v <- eval1 e index ;
                output v };

Although interpret1 is fairly standard, we feel that two things are worth pointing
out. First, the clause for the Declare constructor, which calls push and pop, implicitly
changes the size of the stack and explicitly changes the size of the index (nm::index),
keeping the two in synch. It evaluates the initial value for a new variable, extends the
index with the variable's name and the stack with its value, and then executes the body of
the Declare. Afterwards it removes the binding from the stack (using pop), all the while
implicitly threading the accumulated output. The mapping is in scope only for the body
of the declaration.

Second, the clause for the While constructor introduces a local tail recursive function
loop. This function emulates the body of the while. It is tempting to control the recursion
introduced by the While by using the recursion of the interpret1 function itself, with
a clause something like:


  | While(e,body) =>
      Do mswo { v <- eval1 e index ;
                if v=0 then Return mswo ()
                else Do mswo { interpret1 body index ;
                               interpret1 (While(e,body)) index
                             }
              }


Here,  if  the  test  of  the  loop  is  true,  we  run  the  body  once  (to  transform  the  stack  and 
accumulate  output)  and  then  repeat  the  whole  loop  again.  This  strategy,  while  correct, 
will  have  disastrous  results  when  we  stage  the  interpreter,  as  it  will  cause  the  first  stage 
to  loop  infinitely. 

There are two recursions going on here. First, the unfolding of the finite data structure
which is the program being compiled, and second, the recursion in the program
being compiled. In an unstaged interpreter a single loop suffices. In a staged interpreter,
both loops are necessary. In the first stage we only unfold the program being compiled
and this must always terminate. Thus we must plan ahead as we follow our three step
process. Nevertheless, despite the concessions we have made to staging, this interpreter is
still clear, concise and describes the semantics of the while-language in a straightforward
manner.
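For readers who want to run the unstaged monadic interpreter of this section, the following is a minimal Python sketch of it. It is a transliteration under stated assumptions rather than the MetaML program itself: abstract syntax is encoded as tagged tuples, a computation is a function from a stack to (value, stack, output), None stands in for (), the helper then is introduced here for sequencing, and the sample program prog is illustrative.

```python
def unit(x):
    return lambda ns: (x, ns, "")

def bind(m, f):
    def run(ns):
        a, n1, s1 = m(ns)
        b, n2, s2 = f(a)(n1)
        return (b, n2, s1 + s2)
    return run

def then(m, k):
    """Sequence two computations, ignoring the value of the first."""
    return bind(m, lambda _: k)

# Non-standard morphisms; locations are 1-based.
read = lambda i: lambda ns: (ns[i - 1], ns, "")
write = lambda i, v: lambda ns: (None, ns[:i - 1] + [v] + ns[i:], "")
push = lambda x: lambda ns: (None, [x] + ns, "")
pop = lambda ns: (None, ns[1:], "")
output = lambda n: lambda ns: (None, ns, str(n) + " ")

def position(name, index):
    return index.index(name) + 1

OPS = {"Minus": lambda a, b: a - b,
       "Greater": lambda a, b: 1 if a > b else 0,
       "Times": lambda a, b: a * b}

def eval1(exp, index):
    tag = exp[0]
    if tag == "Constant":
        return unit(exp[1])
    if tag == "Variable":
        return read(position(exp[1], index))
    op = OPS[tag]
    return bind(eval1(exp[1], index), lambda a:
           bind(eval1(exp[2], index), lambda b: unit(op(a, b))))

def interpret1(stmt, index):
    tag = stmt[0]
    if tag == "Assign":
        loc = position(stmt[1], index)
        return bind(eval1(stmt[2], index), lambda v: write(loc, v))
    if tag == "Seq":
        return then(interpret1(stmt[1], index), interpret1(stmt[2], index))
    if tag == "Cond":
        return bind(eval1(stmt[1], index), lambda x:
               interpret1(stmt[2], index) if x == 1 else interpret1(stmt[3], index))
    if tag == "While":
        def loop():  # local loop, as in the While clause of the interpreter
            return bind(eval1(stmt[1], index), lambda v:
                   unit(None) if v == 0
                   else bind(interpret1(stmt[2], index), lambda _: loop()))
        return loop()
    if tag == "Declare":
        return bind(eval1(stmt[2], index), lambda v:
               then(push(v), then(interpret1(stmt[3], [stmt[1]] + index), pop)))
    if tag == "Print":
        return bind(eval1(stmt[1], index), output)
    raise ValueError(tag)

# declare x = 3 in while x > 0 do { print x; x := x - 1 }
prog = ("Declare", "x", ("Constant", 3),
        ("While", ("Greater", ("Variable", "x"), ("Constant", 0)),
         ("Seq", ("Print", ("Variable", "x")),
                 ("Assign", "x", ("Minus", ("Variable", "x"), ("Constant", 1))))))
result = interpret1(prog, [])([])
print(result)
```

Because Python does not eliminate tail calls, each loop iteration deepens the call stack, so this sketch is only suitable for short-running programs.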


4.4  Step  2:  staged  interpreter 

To  specialize  the  monadic  interpreter  to  a  given  program  we  add  two  levels  of  staging 
annotations.  The  result  of  the  first  stage  is  the  intermediate  code,  that  if  executed  returns 
the  value  of  the  program.  The  use  of  the  bracket  annotation  enables  us  to  describe 
precisely  the  code  that  must  be  generated  to  run  in  the  next  stage.  Escape  annotations 
allow  us  to  escape  the  recursive  calls  of  the  interpreter  that  are  made  when  compiling  a 
while-program. 

(* eval2 : Exp -> index -> <int M> *)
fun eval2 exp index =
case exp of
    Constant n => <Return mswo ~(lift n)>
  | Variable x =>
      let val loc = position x index
      in <read ~(lift loc)> end
  | Minus(x,y) =>
      <Do mswo { a <- ~(eval2 x index) ;
                 b <- ~(eval2 y index) ;
                 Return mswo (a - b) }>
  | Greater(x,y) =>
      <Do mswo { a <- ~(eval2 x index) ;
                 b <- ~(eval2 y index) ;
                 Return mswo (if a > b then 1 else 0) }>
  | Times(x,y) =>
      <Do mswo { a <- ~(eval2 x index) ;
                 b <- ~(eval2 y index) ;
                 Return mswo (a * b) }>;

The  lift  operator  inserts  the  value  of  loc  as  the  argument  to  the  read  action.  The 
value  of  loc  is  known  in  the  first-stage  (compile-time),  so  it  is  transformed  into  a  constant 
in  the  second-stage  (run-time)  by  lift. 

To understand why the escape operators are necessary, let us consider a simple
example: eval2 (Minus(Constant 3, Constant 1)) []. We will unfold this example by hand
below:

eval2 (Minus(Constant 3, Constant 1)) [] =

<Do mswo
    { a <- ~(eval2 (Constant 3) []) ;
      b <- ~(eval2 (Constant 1) []) ;
      Return mswo (a - b) }> =

<Do mswo
    { a <- ~<Return mswo 3> ;
      b <- ~<Return mswo 1> ;
      Return mswo (a - b) }> =

<Do mswo
    { a <- Return mswo 3 ;
      b <- Return mswo 1 ;
      Return mswo (a - b) }> =

<Do %mswo
    { a <- Return %mswo 3 ;
      b <- Return %mswo 1 ;
      Return %mswo (a %- b) }>

Each recursive call produces a bracketed piece of code which is spliced into the larger
piece being constructed. Recall that escapes may only appear at level-1 and higher.
Splicing is axiomatized by the reduction rule: ~<x> --> x, which applies only at level-1.
The final step, where mswo and - become %mswo and %-, occurs because both are free
variables and are lexically captured.
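The effect of staging the expression evaluator can be approximated by a function that returns a term describing the computation instead of performing it. The Python sketch below is such an approximation under stated assumptions: the term constructors "Do", "Return", "Read", "Lit", and "Var" and the fresh-name helper gensym are hypothetical, chosen for this illustration. A recursive call that returns a term and is embedded in a larger term plays the role of an escaped call, and the location of a variable is computed while the term is being built.

```python
import itertools

_counter = itertools.count()


def gensym(prefix):
    """Generate a bound-variable name that has not been used before."""
    return f"{prefix}{next(_counter)}"


def position(name, index):
    return index.index(name) + 1


def eval2(exp, index):
    """Translate an expression into a term of the object language."""
    tag = exp[0]
    if tag == "Constant":
        return ("Return", ("Lit", exp[1]))           # <Return mswo ~(lift n)>
    if tag == "Variable":
        return ("Read", position(exp[1], index))     # location fixed at compile time
    a, b = gensym("a"), gensym("b")
    return ("Do", a, eval2(exp[1], index),           # a <- ~(eval2 x index)
            ("Do", b, eval2(exp[2], index),          # b <- ~(eval2 y index)
             ("Return", (tag, ("Var", a), ("Var", b)))))


code = eval2(("Minus", ("Constant", 3), ("Constant", 1)), [])
print(code)
```

The result contains no calls to eval2 and no name lookups: all of that work was done while generating the term, which is the property the staged interpreter relies on.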

Now we can state the equivalence relationship between the monadic eval1 and the
staged eval2. We use the axiomatic semantics of MetaML [28], in particular the axioms
for the annotations, such as the splice axiom above.

Proposition 1. For all expressions exp, and list of names index:
eval1 exp index = run (eval2 exp index)

Proof. We might argue that there is a trivial proof to this proposition. Since eval1
is simply a copy of eval2 with all the staging annotations erased, and since both
functions type-check, by the semantics of MetaML they must be equal. We include a more
traditional proof in the appendix using the axiomatic semantics of MetaML [28] (see
appendix A).

Interpreter for Commands.

Staging the interpreter for commands proceeds in a similar manner:

(* interpret2 : Com -> index -> <unit M> *)
fun interpret2 stmt index =
case stmt of
    Assign(name,e) =>
      let val loc = position name index
      in <Do mswo { n <- ~(eval2 e index) ;
                    write ~(lift loc) n }>
      end
  | Seq(s1,s2) =>
      <Do mswo { x <- ~(interpret2 s1 index) ;
                 y <- ~(interpret2 s2 index) ;
                 Return mswo () }>
  | Cond(e,s1,s2) =>
      <Do mswo { x <- ~(eval2 e index) ;
                 if x=1
                 then ~(interpret2 s1 index)
                 else ~(interpret2 s2 index) }>
  | While(e,body) =>
      <let fun loop () =
             Do mswo { v <- ~(eval2 e index) ;
                       if v=0
                       then Return mswo ()
                       else Do mswo { q <- ~(interpret2 body index) ; loop () }
                     }
       in loop () end>
  | Declare(nm,e,stmt) =>
      <Do mswo { x <- ~(eval2 e index) ;
                 push x ;
                 ~(interpret2 stmt (nm::index)) ;
                 pop }>
  | Print e =>
      <Do mswo { x <- ~(eval2 e index) ;
                 output x }>;

4.4.1  An  example. 

The function interpret2 generates a piece of code from a Com object. To illustrate this
we apply it to the simple program: declare x = 10 in { x := x - 1; print x } and
obtain:

<Do %mswo
  { a <- Return %mswo 10
  ; %push a
  ; Do %mswo
      { e <- Do %mswo
               { d <- Do %mswo
                        { b <- %read 1
                        ; c <- Return %mswo 1
                        ; Return %mswo b %- c
                        }
               ; %write 1 d
               }
      ; g <- Do %mswo
               { f <- %read 1
               ; %output f
               }
      ; Return %mswo ()
      }
  ; %pop
  }>

Note that the staged program is essentially a compiler, translating the syntactic
representation of the while-program into the above monadic object-program that will compute
its meaning. This program sequentializes the decrement of x and the print of x. This
object-program is fully executable. Simply by using the run operator of MetaML, it can be
executed for prototyping purposes.

Equally important, the object-program itself is just a piece of data, which can be
analyzed and further translated in another layer of the translation pipeline. The reader
might notice that this object-program could be further simplified by applying the monad
laws. There are many opportunities for doing so. After these laws are applied we obtain
the much more satisfying:

<Do %mswo
  { %push 10
  ; a <- %read 1
  ; b <- Return %mswo a %- 1
  ; c <- %write 1 b
  ; d <- %read 1
  ; e <- %output d
  ; Return %mswo ()
  ; %pop
  }>

In addition to the monad laws which hold for all monads, we can also use laws which
hold for particular non-standard morphisms. For instance, in the example above, we could
avoid the second read of location 1 using the following rule:

Do { e1; c <- %write 1 b ; d <- %read 1; e2 } = Do { e1; c <- %write 1 b; e2[b/d] }

Every  target  language  will  have  many  such  laws,  and  because  our  target  language  is 
both  executable-code,  and  data-structure  we  can  perform  these  optimizations.  How  this 
is  accomplished  is  the  subject  of  Section  4.5. 
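As a small illustration of such a morphism-specific law, the Python sketch below forwards a written value to an immediately following read of the same location. The flat list-of-steps representation, the action tags, and the function name forward_writes are assumptions made for this sketch; they are not the representation MetaML uses for code.

```python
def forward_writes(steps):
    """Apply  c <- write l b ; d <- read l ; e2  ==>  c <- write l b ; e2[b/d].

    `steps` is a list of (bound_variable, action) pairs, where an action is a
    tuple whose first component is a tag and whose remaining components are
    locations, constants, or variable names.
    """
    result = []
    renamed = {}  # maps an eliminated variable to the variable that replaces it
    for var, action in steps:
        # Apply earlier substitutions e2[b/d] to this action's arguments.
        action = tuple(renamed.get(arg, arg) for arg in action)
        previous = result[-1][1] if result else None
        if (action[0] == "read" and previous is not None
                and previous[0] == "write" and previous[1] == action[1]):
            renamed[var] = previous[2]  # the read would return the value just written
            continue                    # drop the redundant read
        result.append((var, action))
    return result


steps = [(None, ("push", 10)),
         ("a", ("read", 1)),
         ("b", ("sub", "a", 1)),
         ("c", ("write", 1, "b")),
         ("d", ("read", 1)),
         ("e", ("output", "d")),
         (None, ("pop",))]
optimized = forward_writes(steps)
print(optimized)
```

The rule is applied only when the read directly follows the write, which keeps the sketch sound without any analysis of intervening actions.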

As  for  the  eval  function,  we  state  the  semantic  equivalence  between  the  monadic  and 
the  staged  interpreters. 

Proposition 2. For all commands com and list of names index:
interpret1 com index = run (interpret2 com index)

Proof.  See  appendix  A. 

4.5  Step  3:  optimizing  target  code:  the  monadic  laws 

Perhaps the most important contribution we make in this paper is that a staged program
produces a piece of code that is both an executable-program and a data-structure.

If one wants to execute this code, one uses the run annotation. If one wants to optimize
this code, this is possible as well. In this section we illustrate this by example, providing
an implementation of the monad law transformations demonstrated in section 4.4.1.

In this section, we briefly explain our method for analysing (or computing over, or
doing intensional analysis of) MetaML code. We believe that operations such as
pattern-matching and substitution on code should be provided once and for all by the system, and
not by the user.

Optimizations  are  generally  thought  of  as  rewriting  rules  or  transformations.  Both  the 
rules  and  the  strategy  (e.g.  top-down  or  bottom-up)  needed  to  apply  them  need  to  be 
described. 

To  illustrate  this  point,  we  write  a  simple  transformation  which  implements  the  monadic 
laws  as  directed  rewrites.  As  a  reminder,  the  monadic  laws  expressed  in  terms  of  MetaML’s 
Do  and  Return  notation  are  repeated. 

Do { x <- Return e ; z } = z[e/x]

Do { x <- m ; Return x } = m

Do { x <- Do { y <- a ; b } ; c } = Do { y' <- a ; Do { x <- b[y'/y] ; c } }

To  implement  these  rules,  we  need  a  mechanism  for  pattern  matching  over  code.  Like 
all  MetaML  code,  the  result  of  the  monadic  interpreter  is  just  a  data  structure  so  this  is 
possible. 
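As a sanity check on the three laws, here is a minimal Python sketch that tests them observationally on a small state monad. Everything in it is an assumption made for illustration: `unit` and `bind` stand in for Return and Do, and `push` and `read` are hypothetical stack primitives rather than the paper's actual definitions.

```python
# A computation in the state monad is a function: state -> (result, new_state).
def unit(value):
    """Return: yield `value` and leave the state untouched."""
    return lambda state: (value, state)


def bind(m, k):
    """Do { x <- m ; k x }: run m, then feed its result to k."""
    def run(state):
        value, next_state = m(state)
        return k(value)(next_state)
    return run


# Hypothetical stack primitives (illustrative stand-ins, not the paper's code).
def push(n):
    return lambda stack: (None, [n] + stack)


def read(i):
    return lambda stack: (stack[i], stack)


def same(m1, m2, states):
    """Compare two computations observationally on a few sample states."""
    return all(m1(s) == m2(s) for s in states)


states = [[], [1], [5, 7, 9]]
m = bind(push(3), lambda _: read(0))
k = lambda x: push(x + 1)
h = lambda _: read(0)

# Do { x <- return e ; z } = z[e/x]
left_identity = same(bind(unit(4), k), k(4), states)
# Do { x <- m ; return x } = m
right_identity = same(bind(m, unit), m, states)
# Do { x <- Do { y <- a ; b } ; c } = Do { y' <- a ; Do { x <- b ; c } }
associativity = same(bind(bind(m, k), h),
                     bind(m, lambda x: bind(k(x), h)),
                     states)
print(left_identity, right_identity, associativity)
```

Testing on sample states is only evidence, not a proof; the point is that each law relates two programs with the same observable behaviour, which is what licenses using them as rewrites.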

Let us consider a simple example. Suppose we want to match all pieces of code that
are of the form <A + 3>. We have used the A to indicate a meta-variable that will match
any piece of code. We cannot put a variable (e.g. x) here because <x + 3> is just a piece
of code and not a pattern. The solution to indicating a meta-variable in a pattern is to use an
escaped variable at level 1 in the pattern. Thus the pattern <~x + 3> matches all pieces
of code that have this "shape".

Unfortunately, this scheme is not always sufficient when matching against code with
binding constructs such as <fn x => x + 1>. We would like to construct a pattern that
matches against a function (or other binding construct) and to be able to use the
meta-variables bound inside the pattern to construct a transformation. To see why this is
problematic consider the following two examples:

1. We want a transformation that increments the body of an integer valued function,
such that when applied to <fn x => x> we obtain <fn x => x + 1>, and when
applied to <fn y => length y> we obtain <fn y => (length y) + 1>. As a first
approximation we try: <fn x => A> => <fn x => A + 1>. This looks promising,
but what would happen if we wrote: <fn x => A> => <fn y => A + 1> instead?
Now, free occurrences of x in A no longer have a binding site, because they have
been spliced into a context where y is the bound variable instead of x.

2. We want a transformation that doubles the argument of an int -> int function,
such that when applied to <fn x => x> we obtain <fn x => x + x> and when
applied to <fn x => y + x> we obtain <fn x => y + (x + x)>. The problem here
is that in the pattern, <fn x => A>, there is no way to express that A may have
free occurrences of x inside, and that our transformation needs to substitute for
those free occurrences.

The solution is to use a higher-order pattern. Suppose we could parameterize A on x.
This makes A_x not a meta-variable with type code, but a meta-variable with type code
to code. Inside a pattern on the left hand side of a match (pat => exp) a higher order
meta-variable is bound to a function when it is successfully matched against a piece of
code. On the right hand side of the match, when this meta-variable is used (by applying
it to a piece of code) it substitutes all occurrences of x with the argument it was applied
to. For example, consider the table below showing the binding of the higher order
meta-variable A_x when the pattern <fn x => A_x + 3> is matched against different pieces of
code.
code matched against              function bound to

<fn x => x + 3>                   fn x => <~x>
<fn x => (x - 9) + 3>             fn x => <~x - 9>
<fn x => (sin x + x^2) + 3>       fn x => <sin ~x + ~x^2>
<fn x => x + 1>                   match failure

To express this in MetaML we use the convention that the function in an escaped
application (where all the arguments of the application are explicitly bracketed code)
represents a higher order meta-variable. Thus, whenever an escaped application appears
inside a pattern, the function part of the application is a higher-order meta-variable and
its arguments are its formal parameters. For example: ~(g <x>). The two problematic
examples above are now easily expressed as:

<fn x => ~(g <x>)> => <fn y => ~(g <y>) + 1>

<fn x => ~(h <x>)> => <fn z => ~(h <z + z>)>

Because higher order meta-variables may appear only in the function position of
escaped applications, and the arguments of these escaped applications may only be bracketed
bound variables (like <x>), pattern-matching and unification are decidable [16, 25].
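The binding behaviour described above can be imitated outside MetaML. The following Python sketch is an illustration under stated assumptions, not MetaML's mechanism: code is modelled as tagged tuples, a successful match returns an ordinary Python function that performs the substitution, and the helper names (`fresh`, `substitute`, `match_fn`, `match_fn_plus_three`) are hypothetical. Capture is avoided only because `fresh` produces binder names assumed not to occur in any input term.

```python
import itertools

_counter = itertools.count()


def fresh():
    """Return a binder name assumed not to occur in any input term."""
    return f"v%{next(_counter)}"


def substitute(term, name, replacement):
    """Replace free occurrences of variable `name` in `term`."""
    tag = term[0]
    if tag == "var":
        return replacement if term[1] == name else term
    if tag == "int":
        return term
    if tag == "fn":
        if term[1] == name:          # an inner binder shadows `name`
            return term
        return ("fn", term[1], substitute(term[2], name, replacement))
    # Remaining nodes ("add", "sub", "app") carry only sub-terms.
    return (tag,) + tuple(substitute(t, name, replacement) for t in term[1:])


def match_fn(code):
    """Pattern <fn x => ~(g <x>)>: return g, or None on match failure."""
    if code[0] != "fn":
        return None
    _, param, body = code
    return lambda arg: substitute(body, param, arg)


def match_fn_plus_three(code):
    """Pattern <fn x => ~(a <x>) + 3>: return a, or None on match failure."""
    if code[0] == "fn" and code[2][0] == "add" and code[2][2] == ("int", 3):
        _, param, (_, left, _) = code
        return lambda arg: substitute(left, param, arg)
    return None


def increment_body(code):
    """Increment the body of a function, rebinding it under a fresh name."""
    g = match_fn(code)
    y = fresh()
    return ("fn", y, ("add", g(("var", y)), ("int", 1)))


def double_argument(code):
    """Replace the argument x by x + x, rebinding under a fresh name."""
    h = match_fn(code)
    z = fresh()
    return ("fn", z, h(("add", ("var", z), ("var", z))))


x = ("var", "x")
a = match_fn_plus_three(("fn", "x", ("add", ("sub", x, ("int", 9)), ("int", 3))))
print(a(("var", "w")))                                          # bound function applied to <w>
print(match_fn_plus_three(("fn", "x", ("add", x, ("int", 1)))))  # match failure
print(double_argument(("fn", "x", ("add", ("var", "y"), x))))
```

Because the bound function carries the binder with it, the right hand side is free to choose a new bound variable, which is exactly what the first-order pattern <fn x => A> could not express.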

We now possess the tools to present opt, a MetaML function that optimizes code using
the monad laws:

fun opt <Do ~st { x <- ~v ; return ~st x }> = opt v
  | opt w as <Do ~st { x <- Return ~st ~e ; ~(z <x>) }> =
      if is_constant e then opt (z e) else w
  | opt <Do ~st { x <- Do ~st { y <- ~e ; ~(f <y>) } ; ~(g <x>) }> =
      opt <Do ~st { y' <- ~e ; x' <- ~(f <y'>) ; ~(g <x'>) }>
  | opt x = map_code opt x   (* traversal through the code *)

Our opt function implements a limited form of the left-id monad law, because we do not
wish to duplicate a non-constant by substitution. By composing this optimization with
interpret2 we obtain a better compiler. Applying this compiler to:

Declare x = 150 in
Declare y = 200 in while x > 0 Do { x := x - 1; y := y - 1 }

we obtain the following program:

<Do %state
  { a <- %push 150;
    b <- %push 200;
    c <- let fun loop () =
               Do %state
                 { e <- %read 1;
                   f <- return %state (if (e %> 0) then 1 else 0);
                   if (f %= 0)
                     then return %state ()
                     else Do %state
                            { g <- %read 1;
                              h <- return %state (g - 1);
                              i <- %write 1 h;
                              j <- %read 0;
                              k <- return %state (j - 1);
                              l <- %write 0 k;
                              loop ()
                            }
                 }
         in loop () end;
    m <- %pop;
    %pop
  }
>


The  optimizer  has  fully  sequentialized  the  code  using  the  bind-associativity  law,  and 
removed  all  superfluous  Return’s  using  the  unit-identity  laws.  Further  optimizations,  such 
as  arithmetic  simplification,  or  transformations  to  another  form,  such  as  assembly  code, 
could  be  implemented  in  the  same  fashion. 
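To make the rewriting strategy concrete, here is a minimal Python sketch of a monad-law optimizer over a toy representation of monadic code. It is an illustration under assumptions, not the MetaML implementation: code is modelled as tagged tuples (`("do", x, m, body)`, `("ret", e)`, `("prim", name, ...)`), the helper names are hypothetical, and children are optimized before the laws are tried at the root. As in the text, the left identity law fires only when the returned expression is a constant.

```python
import itertools

_counter = itertools.count()


def fresh():
    """Return a binder name assumed not to occur in any input term."""
    return f"v%{next(_counter)}"


def substitute(term, name, replacement):
    """Replace free occurrences of variable `name` in `term`."""
    tag = term[0]
    if tag == "var":
        return replacement if term[1] == name else term
    if tag == "do":
        _, x, m, body = term
        new_m = substitute(m, name, replacement)
        # The binder x shadows `name` inside the body only.
        new_body = body if x == name else substitute(body, name, replacement)
        return ("do", x, new_m, new_body)
    # Other nodes: keep strings and integers, recurse into sub-terms.
    return (tag,) + tuple(
        substitute(t, name, replacement) if isinstance(t, tuple) else t
        for t in term[1:]
    )


def is_constant(expr):
    return expr[0] == "int"


def opt(term):
    """Apply the three monad laws as directed rewrites, bottom-up."""
    if term[0] != "do":
        return term
    _, x, m, body = term
    m, body = opt(m), opt(body)
    if body == ("ret", ("var", x)):              # Do {x <- m; return x} = m
        return m
    if m[0] == "ret" and is_constant(m[1]):      # limited left identity
        return opt(substitute(body, x, m[1]))
    if m[0] == "do":                             # bind associativity
        _, y, a, b = m
        y2 = fresh()
        renamed_b = substitute(b, y, ("var", y2))
        return opt(("do", y2, a, ("do", x, renamed_b, body)))
    return ("do", x, m, body)


read1 = ("prim", "read", ("int", 1))
nested = ("do", "x",
          ("do", "y", read1, ("ret", ("var", "y"))),
          ("do", "k", ("ret", ("int", 1)),
           ("prim", "write", ("int", 0), ("op", "-", ("var", "x"), ("var", "k")))))
print(opt(nested))
```

The renaming step in the associativity case mirrors the y' of the law: moving `a` outward must not let its binder capture variables in the continuation.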


5  Related  work 

Our work was inspired by work in many different areas: the derivation of compilers from
specifications and the use of action semantics [19, 23, 11, 22]; the use of monads to structure
programs in general [18, 31, 26] and language implementations in particular [32, 27, 14];
staged programming [5, 6] and its use in structuring compilers [29, 20, 4]; partial evaluation
[34, 17, 1, 3, 2, 9]; and higher order abstract syntax and pattern matching [16, 7].

For  space  considerations  we  limit  detailed  discussion  to  the  following  areas. 

5.1  Monads  and  compilation 

Perhaps the most closely related work is that of Sheng Liang and his thesis advisor Paul
Hudak [12, 13]. They investigate the derivation of a compiler from a modular monadic
interpreter. Our work is a continuation of their effort of using monads as a standard
compilation mechanism. However, some differences remain:

• The use of staging led us, at an early step in the development, to split the
environment into a static index of names and a dynamic stack of values. This allows us to
avoid the use of an environment monad. We use instead a state transformer monad
in which the state is managed like a stack. Liang uses a complicated monad which
is a combination of an environment monad and a state transformer. After code
generation they show that the residual code due to the environment (the lookups of the
location of variables) can be eliminated using axioms of the non-standard morphisms
of the environment monad. Our use of staging allowed us to do the lookups in the
first stage and to never residualize the lookups at all.

•  On  the  other  hand,  Liang’s  use  of  modular  language  components  is  an  advantage  we 
have  not  even  attempted  to  employ.  For  simplicity,  we  have  used  the  same  monad 
for  both  expressions  and  commands  while  Liang  uses  a  modular  approach  where 
each  feature  is  defined  independently  from  the  others.  Finally  all  the  features  are 
combined  by  a  monad  transformer.  To  do  this  it  is  necessary  to  lift  all  non-standard 
morphisms  through  the  transformer.  This  is  hard  and  not  completely  understood. 
We  may  try  to  duplicate  Liang’s  approach  in  future  work. 
5.2 Staging and compilation

In his thesis Calculating Compilers [15] Erik Meijer advocates staging a compiler by using
self discipline: construct a compiler by building it as the composition of compile-time and
run-time components. A critical step in this process is finding a representation of every
source language construct as a combination of (lower level) target level constructs. By
representing both source and target languages as algebraic datatypes, say source and
target, induced by the functors S and T, this can be reduced to finding a polymorphic
function Trans, which for all a has type (T a -> a) -> (S a -> a), a so-called algebra
transformer.

Let the semantic domain of the target algebra be some type value. If the semantic
meaning function for the target language M : target -> value can be expressed as
a catamorphism M = cata phi where phi : T value -> value, we can lift phi into an
interpreter for the source language by applying the algebra transformer Trans. Thus
Trans phi : S value -> value and Interp = cata (Trans phi) : source -> value. A
similar construction can be used to construct the compiler Compiler : source -> target.
Let the function In : T target -> target be the injection between the functor T and its
induced algebraic datatype target; then cata (Trans In) : source -> target constructs
the compiler.
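The construction can be tried out on a toy pair of languages. The sketch below is an assumption-laden illustration in Python, not Meijer's formulation: the source language has only literals and subtraction, the target is a three-instruction stack code, both are modelled as tagged tuples, and the names `trans`, `phi`, `inject`, `interp`, `compiler`, and `meaning` are chosen here to echo Trans, phi, In, Interp, Compiler, and M.

```python
def cata_source(alg, term):
    """Fold a source term with an S-algebra `alg`."""
    if term[0] == "minus":
        return alg(("minus", cata_source(alg, term[1]), cata_source(alg, term[2])))
    return alg(term)                       # ("lit", n) has no sub-terms


def cata_target(alg, code):
    """Fold target code with a T-algebra `alg`."""
    if code[0] == "seq":
        return alg(("seq", cata_target(alg, code[1]), cata_target(alg, code[2])))
    return alg(code)                       # ("push", n) and ("sub",)


def trans(alg):
    """Algebra transformer: turn any T-algebra into an S-algebra."""
    def source_alg(node):
        if node[0] == "lit":
            return alg(("push", node[1]))
        _, left, right = node              # ("minus", left, right)
        return alg(("seq", left, alg(("seq", right, alg(("sub",))))))
    return source_alg


def phi(node):
    """Target algebra over value = (stack -> stack)."""
    if node[0] == "push":
        n = node[1]
        return lambda stack: [n] + stack
    if node[0] == "sub":
        return lambda stack: [stack[1] - stack[0]] + stack[2:]
    _, first, second = node                # ("seq", first, second)
    return lambda stack: second(first(stack))


def inject(node):
    """In : T target -> target; target terms are the tagged tuples themselves."""
    return node


def interp(term):                          # cata (Trans phi)
    return cata_source(trans(phi), term)


def compiler(term):                        # cata (Trans In)
    return cata_source(trans(inject), term)


def meaning(code):                         # M = cata phi
    return cata_target(phi, code)


program = ("minus", ("lit", 10), ("minus", ("lit", 4), ("lit", 1)))
print(compiler(program))
print(interp(program)([]), meaning(compiler(program))([]))
```

Since `trans` never inspects its argument algebra, the same definition yields both the interpreter and the compiler, and running the compiled code through `meaning` agrees with interpreting the source directly.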

The limiting factor in this approach is finding an algebraic datatype target to encode
the target language. For a monadic target language, it is not known how to do this, since
the constructors for “unit” and “bind” would be too polymorphic to encode in an algebraic
datatype, and many of the non-standard morphisms would not be polymorphic enough.

By staging the process in MetaML, we do away with the need for an algebraic datatype
to encode the target language, by using the special type of code instead. The constructors
of the target algebra are simply the second stage representations of the real functions.

5.3  Difference  between  staging  and  partial  evaluation 

Staged  programming  (S.P.)  is  closely  related  to  partial  evaluation  (P.E.).  We  list  what  we 
believe  are  the  salient  differences. 

•  S.P.  uses  explicit  annotations  while  P.E.  uses  implicit  annotations  placed  by  an 
automatic  binding  time  analysis. 

• S.P. gives the programmer complete control over what residual program is produced,
while the residual program produced by P.E. often contains surprises. The surprises
are caused by the differences between what the programmer knows and what the
binding time analysis can discover. The solution to this mismatch is for the
programmer to restructure his program using “binding-time improvements” which more
closely align his knowledge and the capabilities of the binding time analysis. Of
course S.P. is not completely immune to these difficulties, but the staged
programmer must be fully aware of the staging issues before he writes his program. The
staged type-system is a great advantage here. Nevertheless, there are many simple
programs where automatic binding time analysis is sufficient, and hand staging is
simply an annoyance. In our system we have combined the advantages of both,
allowing a simple type-directed binding time analysis to co-exist with the manual
staging annotations. An analysis of this co-existence is beyond the scope of this
paper.

• S.P. is a programming language feature. It exists at the same level as the program.
Here the algorithm and the staging are developed hand in hand. There are no
additional tools or processes, and users learn how to weave the staging thought
processes into their problem solving techniques.

• S.P. provides a complete, unified, typed environment, supporting both type
reconstruction and polymorphism for the staged constructs.

6  The  Implementation 

Everything  you  have  seen  in  this  paper,  except  the  higher  order  pattern  matching  over 
code,  has  been  implemented  in  the  MetaML  implementation.  The  examples  are  actual 
runs  of  the  system. 

The higher order pattern matching is currently under development. We found the
normalizing effect of the monad laws so compelling that we implemented them in an
ad-hoc fashion inside the MetaML system.

7  Conclusion 

We have shown that staging programs offers an exciting new programming paradigm, and
reinforced the notion that staging a monadic interpreter into compile-time and run-time
components provides a direct link between an interpreter and a compiler.
A Proofs


We repeat here the axiomatic semantics of MetaML [28]. For the sake of simplicity, we
omit the level-annotations.

run <v1> = v1↓            (run)

~<e> = e                  (escape)

(λx. e) v = e[x := v]     (beta)

The (escape) axiom applies only at level one (inside exactly one bracket), and (run) and
(beta) apply only at level 0 (inside no brackets).

Lemma 1. For any well-typed expression <~e>, we have <~e> = e.

Proof. Since <~e> is well-typed, e must evaluate (if it terminates) to <v>. Then e =
<v>. We have

<~e> = <~<v>>     by replacing equals by equals
     = <v>        by escape axiom
     = e

□


Lemma 2. For any well-typed expression run <f e>, we have
run <f e> = (run <f>) (run <e>).

Proof. Since the term f e is at level 1, the only possible reduction is by the escape axiom.
Assume <f> and <e> evaluate to the values <f1> and <e1> respectively. Then <f e> must
evaluate to <f1 e1> (since at level 1 we cannot do a beta-step). Hence, we have <e> =
<e1>, <f> = <f1>, <f e> = <f1 e1>.

run <f e> = run <f1 e1>               by replacing equals by equals
          = (f1 e1)↓                  by run axiom
          = (f1↓) (e1↓)               by definition of ↓
          = (run <f1>) (run <e1>)     by run axiom
          = (run <f>) (run <e>)       by replacing equals by equals

□

Lemma 3. For any well-typed expression run <λx.e>, we have
run <λx.e> = λx.(run <e>).

Proof. The proof is similar to the two lemmas above. □

A consequence of the previous two lemmas is that run distributes through its
subexpressions. In particular, run distributes through Do and let.

run <Do { x1 <- e1 ; x2 <- e2 ; ... ; en }> =
    Do { x1 <- run <e1> ; x2 <- run <e2> ; ... ; run <en> }         (run-Do)

run <let val x = e1 in e2> = (let val x = run <e1> in run <e2>)     (run-Let)

Proposition 1. For all expressions exp and list of names index:
eval1 exp index = run (eval2 exp index)
Proof. Induction on the structure of exp.
case exp of Minus (e1, e2):

run (eval2 (Minus (e1, e2)) index)
=   by beta axiom
run <Do mswo { a <- ~(eval2 e1 index) ;
               b <- ~(eval2 e2 index) ;
               Return mswo (a - b) }>
=   by (run-Do)
Do mswo { a <- run <~(eval2 e1 index)> ;
          b <- run <~(eval2 e2 index)> ;
          run <Return mswo (a - b)> }
=   by Lemma 1 (twice) and run axiom
Do mswo { a <- run (eval2 e1 index) ;
          b <- run (eval2 e2 index) ;
          Return mswo (a - b) }
=   by induction hypothesis (twice)
Do mswo { a <- eval1 e1 index ;
          b <- eval1 e2 index ;
          Return mswo (a - b) }
=   by beta
eval1 (Minus (e1, e2)) index

The other cases are similar. □

Proposition 2. For all commands com and list of names index:

interpret1 com index = run (interpret2 com index)

Proof. By induction on the structure of com.
case com of While (e, body):

run (interpret2 (While (e, body)) index)
=   by beta
run <let fun loop () =
           Do mswo { v <- ~(eval2 e index);
                     if v = 0
                       then Return mswo ()
                       else Do mswo { q <- ~(interpret2 body index); loop () }
                   }
     in loop () end>
=   by run-Do and run-Let
let fun loop () =
      Do mswo { v <- run <~(eval2 e index)>;
                if v = 0
                  then run <Return mswo ()>
                  else Do mswo { q <- run <~(interpret2 body index)>;
                                 run <loop ()> }
              }
in run <loop ()> end
=   by Lemma 1 and run axiom
let fun loop () =
      Do mswo { v <- run (eval2 e index);
                if v = 0
                  then Return mswo ()
                  else Do mswo { q <- run (interpret2 body index);
                                 run <loop ()> }
              }
in run <loop ()> end
=   by induction hypothesis and Proposition 1
let fun loop () =
      Do mswo { v <- eval1 e index;
                if v = 0
                  then Return mswo ()
                  else Do mswo { q <- interpret1 body index; run <loop ()> }
              }
in run <loop ()> end
=   by run axiom
let fun loop () =
      Do mswo { v <- eval1 e index;
                if v = 0
                  then Return mswo ()
                  else Do mswo { q <- interpret1 body index; loop () }
              }
in loop () end


The  last  step  is  only  possible  because,  at  this  step  in  the  derivation,  there  are  no 
annotations  (in  particular  no  escapes)  in  the  body  of  the  function  loop,  thus  the  body  of 
loop  at  level  1  is  a  value,  and  hence  in  normal  form. 

The  other  cases  are  easier.  □ 
Abstract

Multi-staged  programming  provides  a  new  paradigm  for  constructing  efficient  solutions  to 
complex  problems.  Techniques  such  as  program  generation,  multi-level  partial  evaluation,  and 
run-time  code  generation  respond  to  the  need  for  general  purpose  solutions  which  do  not  pay 
run-time  interpretive  overheads.  This  paper  provides  a  foundation  for  the  formal  analysis  of  one 
such  system. 

We introduce a multi-stage language and present its axiomatic, reduction, and natural
semantics. Our axiomatic semantics is an extension of the call-by-value λ-calculus with staging
constructs. We demonstrate the soundness of the axiomatic semantics with respect to the
natural semantics. We show that staged languages can “go Wrong” in new ways, and devise a type
system that screens out such programs. Finally, we present a proof of the soundness of this
type system with respect to the reduction semantics, and show how to extend this result to the
natural semantics.


1  Introduction 

Recently,  there  has  been  significant  interest  in  various  forms  of  multi-stage  computation,  including 
program  generation  [3,  26],  multi-level  partial  evaluation  [11,  12],  and  run-time  code  generation 
[1,  5,  4,  8,  9,  13,  15, 16,  22].  Such  techniques  combine  both  the  software  engineering  advantages  of 
general  purpose  systems  and  the  efficiency  of  specialized  ones. 

Because such systems execute generated code never inspected by human eyes, it is important to
use formal analysis to guarantee properties of this generated code. We would like to guarantee
statically that a program generator synthesizes only programs with properties such as: type-correctness,
global references only to names in scope, and local names which do not inadvertently hide global
references.

In previous work [25], we introduced a multi-stage programming language called MetaML. In
that work we introduced four staging annotations to control the order of evaluation of terms.
We argued that staged programs are an important mechanism for constructing general purpose
systems with the efficiency of specialized ones, and addressed engineering issues necessary to make
such systems usable by programmers. We introduced an operational semantics and a type system
to screen out bad programs, but we were unable to prove the soundness of the type system.

Further investigation revealed important subtleties that were not previously apparent to us. In
this paper, we report on work rectifying some of the practical limitations of our previous work.
In contrast to our earlier work that focused on implementations and problem solving using
multi-staged programs, this paper reports on a more abstract treatment of MetaML’s foundations. The
key results reported in this paper are as follows:

1.  An  axiomatic  semantics  and  a  reduction  semantics  for  a  core  sub-language  of  MetaML. 

2.  A  characterization  of  the  additional  ways  in  which  a  staged  program  can  “go  Wrong”. 
3. A type system to screen out such programs.

4. A soundness proof for the type system with respect to the reduction semantics using the
syntactic approach to type-soundness of Wright and Felleisen [27].

5.  A  natural  semantics  that  chooses  the  order  in  which  rules  are  applied. 

6.  The  soundness  of  the  axiomatic  semantics  with  respect  to  the  natural  semantics. 

These results form a strong, tightly-woven foundation which gives us both a better
understanding of MetaML, and more confidence in the well-foundedness of the multi-stage paradigm. The
axiomatic semantics provides us with an equational theory for formally reasoning about the
equivalence of MetaML programs, and the reduction semantics is an abstract characterization of the
notion of staged computation. The natural semantics provides us with a deterministic strategy for
implementing multi-stage computation. The soundness of the axiomatic semantics with respect to
the natural semantics formally demonstrates that results based on the reduction semantics are
also applicable to our implementation. Finally, formally proving the soundness of the type system
with respect to the reduction semantics assures us that well-typed programs are well-behaved.

1.1  What  are  Staged  Programs  All  About? 

In staging a program, the user has control over the order of evaluation of terms. This is done
by using staging annotations. In MetaML the staging annotations are Brackets <>, Escape ~, and
run. An expression <e> defers the computation of e; ~e splices the deferred expression obtained by
evaluating e into the body of a surrounding Bracketed expression; and run e evaluates e to obtain
a deferred expression, and then evaluates this deferred expression. It is important to note that ~ is
only legal within lexically enclosing Brackets. To illustrate, consider the script of a small MetaML
session below:

-| val pair = (3+4, <3+4>);
val pair = (7, <3+4>) : (int * <int>)

-| fun f (x,y) = <8 - ~y>;
val f = fn : ('a * <int>) -> <int>

-| val code = f pair;
val code = <8 - (3+4)> : <int>

-| run code;
val it = 1 : int

The  first  declaration  defines  a  variable  pair.  The  first  component  of  the  pair  is  evaluated,  but  the 
evaluation  of  the  second  component  is  deferred  by  the  Brackets.  Brackets  in  types  such  as  <int> 
are  read  “Code  of  int”,  and  distinguish  values  such  as  <3+4>  from  values  such  as  7.  The  second 
declaration  illustrates  that  code  can  be  abstracted  over,  and  that  it  can  be  spliced  into  a  larger 
piece  of  code.  The  third  declaration  applies  the  function  f  to  pair  performing  the  actual  splicing. 
And  the  last  declaration  evaluates  this  deferred  piece  of  code. 

To  give  a  brief  feel  for  how  MetaML  is  used  to  construct  larger  pieces  of  code  at  run-time 
consider: 

-| fun mult x n = if n=0 then <1> else <~x * ~(mult x (n-1))>;
val mult = fn : <int> -> int -> <int>

-| val cube = <fn y => ~(mult <y> 3)>;
val cube = <fn a => a * (a * (a * 1))> : <int -> int>

-| fun exponent n = <fn y => ~(mult <y> n)>;
val exponent = fn : int -> <int -> int>

The  function  mult,  given  an  integer  piece  of  code  x  and  an  integer  n,  produces  a  piece  of  code  that 
is  an  n-way  product  of  x.  This  can  be  used  to  construct  the  code  of  a  function  that  performs  the 
cube  operation,  or  generalized  to  a  generator  for  producing  an  exponentiation  function  from  a  given 
exponent  n.  Note  how  the  looping  overhead  has  been  removed  from  the  generated  code.  This  is  the 
purpose  of  program  staging  and  it  can  be  highly  effective  as  discussed  elsewhere  [4, 10, 13, 17, 22,  25]. 
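For readers without a MetaML system at hand, the same generator can be imitated in Python with code represented as source strings. This is only a rough analogy under stated assumptions: string splicing plays the role of Escape, the built-in `eval` plays the role of run, the spliced fragment `x` is assumed to be an atomic expression, and none of MetaML's typing or scoping guarantees carry over.

```python
def mult(x, n):
    """Code (a string) for the n-way product of the code fragment `x`."""
    if n == 0:
        return "1"
    rest = mult(x, n - 1)
    return f"{x} * {rest}" if n == 1 else f"{x} * ({rest})"


def exponent(n):
    """Code for a function raising its argument to the power n."""
    return f"lambda y: {mult('y', n)}"


def run(code):
    """Evaluate a code string; stands in for MetaML's run."""
    return eval(code)


cube_code = exponent(3)
cube = run(cube_code)
print(cube_code)
print(cube(2))
```

The recursion over n happens while the string is being built, so the generated function contains only multiplications and no loop, which is the effect the text describes.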

In  this  paper  we  move  away  from  how  staged  languages  are  used  and  address  their  foundations. 

2 The λ-R Language

The λ-R language represents the core of MetaML. It has the following syntax:

e := i | x | e e | λx.e | <e> | ~e | run e

which includes the normal constructs of the λ-calculus, integer constants, and the three additional
staging constructs.

To define the semantics of Escape, which is dependent on the surrounding context, we choose to
explicitly annotate all terms with their level. The level of a term is the number of Brackets minus
the number of Escapes surrounding that term. We define level-annotated terms as follows:

a^0     := i^0 | x^0 | (a^0 a^0)^0 | (λx.a^0)^0 | <a^1>^0 | (run a^0)^0

a^{n+1} := i^{n+1} | x^{n+1} | (a^{n+1} a^{n+1})^{n+1} | (λx.a^{n+1})^{n+1} | <a^{n+2}>^{n+1} | (~a^n)^{n+1} | (run a^{n+1})^{n+1}

Note that Escape never appears at level 0 in a level-annotated term. We define a λ-R program
as a closed term a^0. Hence, example programs are (λx.x^0)^0 and <<(λx.(x^2 5^2)^2)^2>^1>^0.
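The level discipline can be computed mechanically. The following Python sketch is illustrative only: the tuple encoding of terms and the function name `annotate` are assumptions, not part of the paper. It adds a level to every sub-term, going up by one under a Bracket and down by one under an Escape, and rejects an Escape at level 0.

```python
def annotate(term, level=0):
    """Return `term` with a level appended to every node.

    Unannotated terms are tagged tuples: ("int", i), ("var", x),
    ("app", e1, e2), ("lam", x, e), ("br", e), ("esc", e), ("run", e).
    """
    tag = term[0]
    if tag in ("int", "var"):
        return (tag, term[1], level)
    if tag == "app":
        return ("app", annotate(term[1], level), annotate(term[2], level), level)
    if tag == "lam":
        return ("lam", term[1], annotate(term[2], level), level)
    if tag == "run":
        return ("run", annotate(term[1], level), level)
    if tag == "br":
        # The body of a Bracket lives one level higher.
        return ("br", annotate(term[1], level + 1), level)
    if tag == "esc":
        # The body of an Escape lives one level lower; there is no level -1.
        if level == 0:
            raise ValueError("Escape cannot appear at level 0")
        return ("esc", annotate(term[1], level - 1), level)
    raise ValueError(f"unknown term: {term!r}")


identity = ("lam", "x", ("var", "x"))
print(annotate(identity))
print(annotate(("br", ("esc", ("int", 7)))))
```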

2.1  Values 

It is instructive to think of values as the set of terms we consider to be acceptable results from a
computation. Values are defined as follows:

v^0     := i^0 | x^0 | (λx.a^0)^0 | <v^1>^0

v^1     := i^1 | x^1 | (v^1 v^1)^1 | (λx.v^1)^1 | <v^2>^1 | (run v^1)^1

v^{n+2} := i^{n+2} | x^{n+2} | (v^{n+2} v^{n+2})^{n+2} | (λx.v^{n+2})^{n+2} | <v^{n+3}>^{n+2} | (~v^{n+1})^{n+2} | (run v^{n+2})^{n+2}

The set of values for λ-R has three notable points. First, values can be bracketed expressions. This
means that computations can return pieces of code representing other programs. Second, values
can contain applications such as (λy.y^1)^1 (λx.x^1)^1. Third, there are no level 1 Escapes in values.
We take advantage of this important property of values in many proofs and propositions in our
present work.

Because each rule in the inductive definition above is an instance of one of the rules given in
the inductive definition for level-annotated terms, it is easy to show that values are a subset of
level-annotated terms.
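The inductive definition of values translates directly into a recursive predicate. The sketch below is a hedged illustration: it assumes level-annotated terms are encoded as tagged tuples whose last component is the level, for example `("lam", "x", body, 0)`, assumes the annotations are already well-formed, and uses the hypothetical name `is_value`.

```python
def is_value(term):
    """Decide whether a well-formed level-annotated term is a value."""
    tag, level = term[0], term[-1]
    if tag in ("int", "var"):
        return True
    if level == 0:
        if tag == "lam":
            return True                      # any level 0 abstraction
        if tag == "br":
            return is_value(term[1])         # <v^1>^0
        return False                         # applications and run at level 0
    if tag == "esc":
        # Escapes are allowed in values only at level 2 and above.
        return level >= 2 and is_value(term[1])
    if tag == "lam":
        return is_value(term[2])
    # "app", "br", "run" at level >= 1: every sub-term must be a value.
    return all(is_value(t) for t in term[1:-1])


lam_y = ("lam", "y", ("var", "y", 1), 1)
lam_x = ("lam", "x", ("var", "x", 1), 1)
code_with_app = ("br", ("app", lam_y, lam_x, 1), 0)
level1_escape = ("br", ("esc", ("br", ("int", 1, 1), 0), 1), 0)
print(is_value(code_with_app), is_value(level1_escape))
```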

2.2 Contexts

We generalize the notion of contexts [2] to a notion of annotated contexts:

c^0     := []^0 | (c^0 a^0)^0 | (a^0 c^0)^0 | (λx.c^0)^0 | <c^1>^0 | (run c^0)^0

c^{n+1} := []^{n+1} | (c^{n+1} a^{n+1})^{n+1} | (a^{n+1} c^{n+1})^{n+1} | (λx.c^{n+1})^{n+1} |
           <c^{n+2}>^{n+1} | (~c^n)^{n+1} | (run c^{n+1})^{n+1}

where [] is a hole. When instantiating an annotated context c^n[]^m to a term e^m we write c^n[e^m].


2.3  Promotion  and  Demotion 

The  axioms  of  MetaML  remove  Brackets  from  level-annotated  terms.  To  maintain  the  consistency 
of  the  level-annotations  we  need  an  inductive  definition  for  incrementing  and  decrementing  all 
annotations  on  a  term.  We  call  these  operations  promotion  and  demotion. 


Promotion

i^n ↑            = i^{n+1}
x^n ↑            = x^{n+1}
(a1 a2)^n ↑      = (a1↑ a2↑)^{n+1}
(λx.a)^n ↑       = (λx.a↑)^{n+1}
<a>^n ↑          = <a↑>^{n+1}
(~a)^{n+1} ↑     = (~a↑)^{n+2}
(run a)^n ↑      = (run a↑)^{n+1}

Demotion

i^{n+1} ↓        = i^n
x^{n+1} ↓        = x^n
(a1 a2)^{n+1} ↓  = (a1↓ a2↓)^n
(λx.a)^{n+1} ↓   = (λx.a↓)^n
<a>^{n+1} ↓      = <a↓>^n
(~a)^{n+2} ↓     = (~a↓)^{n+1}
(run a)^{n+1} ↓  = (run a↓)^n

Promotion  is  a  total  function  over  level-annotated  terms  and  is  defined  by  a  simple  inductive 
definition.  Demotion  is  a  partial  function  over  level-annotated  terms.  Demotion  is  undefined  on 
terms  Escaped  at  level  1,  and  on  level  0  terms  in  general. 

An  important  property  of  demotion  is  that  while  it  is  partial  over  level-annotated  terms  it  is 
total  over  values.  Proof  of  this  is  a  simple  induction  on  the  structure  of  values. 
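Promotion and demotion are easy to prototype. The Python sketch below is an illustration under assumptions: level-annotated terms are encoded as tagged tuples whose last component is the level, and the names `promote` and `demote` are chosen here. Demotion raises an error exactly in the two cases where the text says it is undefined.

```python
def promote(term):
    """Increment every level annotation; defined on all annotated terms."""
    *fields, level = term
    return tuple(promote(f) if isinstance(f, tuple) else f
                 for f in fields) + (level + 1,)


def demote(term):
    """Decrement every level annotation; partial on annotated terms."""
    *fields, level = term
    if level == 0:
        raise ValueError("demotion is undefined on level 0 terms")
    if fields[0] == "esc" and level == 1:
        raise ValueError("demotion is undefined on terms Escaped at level 1")
    return tuple(demote(f) if isinstance(f, tuple) else f
                 for f in fields) + (level - 1,)


# A level 1 value: the application (λy.y^1)^1 (λx.x^1)^1.
value = ("app",
         ("lam", "y", ("var", "y", 1), 1),
         ("lam", "x", ("var", "x", 1), 1),
         1)
print(demote(value))
print(promote(demote(value)) == value)
```

On this value demotion succeeds and promotion undoes it, in line with the claim that demotion is total over values even though it is partial over level-annotated terms.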


2.4 Substitution

The definition of substitution is standard for the most part. In this paper we are concerned only
with the substitution of values for variables. When the level of a value is different from the level
of the term in which it is being substituted, promotion (or demotion, whichever is appropriate) is
used to correct the level of the subterm.

i^n[x^n := v^n]                   = i^n
x^n[x^n := v^n]                   = v^n
y^n[x^n := v^n]                   = y^n                                       x ≠ y
(a1 a2)^n[x^n := v^n]             = ((a1[x^n := v^n]) (a2[x^n := v^n]))^n
(λx.a1)^n[x^n := v^n]             = (λx.a1)^n
(λy.a1)^n[x^n := v^n]             = (λy'.(a1[y^n := y'^n][x^n := v^n]))^n
                                    y' ∉ FV(v^n), y' ∉ FV(a1), x ≠ y
<a1>^n[x^n := v^n]                = <a1[x^{n+1} := v^n ↑]>^n
(~a1)^{n+1}[x^{n+1} := v^{n+1}]   = (~(a1[x^n := v^{n+1} ↓]))^{n+1}
(run a1)^n[x^n := v^n]            = (run (a1[x^n := v^n]))^n

This function is total because both promotion and demotion are total over values. A richer
notion of demotion is needed to perform substitution of a variable by any expression. This generalization
is beyond the scope of this paper.

2.5 Axiomatization and Reduction Semantics of λ-R

The axiomatic semantics describes an equivalence between two level-annotated terms. Axioms can
be thought of as pattern-based equivalence rules, and are applicable in a context-independent way
to any subterm that they match. The three axioms we will introduce can each be given a natural
orientation or direction, reducing “bigger” terms to “smaller” terms. This provides a reduction
semantics.


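Axiomatic

\[
\begin{array}{rcl}
((\lambda x.e^{n})^{n}\ v^{n})^{n} & = & e^{n}[x^{n} := v^{n}] \\
(\mathsf{run}\ \langle v^{n+1} \rangle^{n})^{n} & = & v^{n+1}\downarrow \\
({\sim}\langle e^{n+1} \rangle^{n})^{n+1} & = & e^{n+1}
\end{array}
\]

Reduction

\[
\begin{array}{rcl}
((\lambda x.e^{n})^{n}\ v^{n})^{n} & \to & e^{n}[x^{n} := v^{n}] \\
(\mathsf{run}\ \langle v^{n+1} \rangle^{n})^{n} & \to & v^{n+1}\downarrow \\
({\sim}\langle e^{n+1} \rangle^{n})^{n+1} & \to & e^{n+1}
\end{array}
\]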

We write $\lambda\text{-R} \vdash M = N$ when $M = N$ is provable by the above axioms and the classical inference
rules of an equational theory, and we write $\to^{*}$ for the reflexive, transitive, context closure of $\to$.

Theorem 1 (Confluence). The reduction semantics is confluent.

Proof. Using a notion of parallel reduction and a Strip Lemma, following closely the development
in [2, pages 277-283]. □

Corollary 2 (Church-Rosser). The axiomatic semantics is Church-Rosser.

3  Faulty  Terms 

Under  the  reduction  semantics,  when  a  term  has  been  sufficiently  reduced,  we  would  like  such  a 
term  to  be  a  value,  but  this  is  not  always  the  case.  If  no  rules  apply,  and  the  term  is  not  a  value, 
we  say  that  such  a  term  is  stuck  [27].  There  are  four  contexts  in  which  such  terms  can  arise: 

1. A non-λ value in a function position in an application (at level 0). This is the familiar form
of undesirable behavior arising whenever the pure λ-calculus is extended with constants. For
example, $(\langle 5^1 \rangle^0\ 3^0)^0$ is stuck because $\langle 5^1 \rangle^0$ is a piece of code, not a λ-abstraction. This term
is not a value and contains no redex.

2. A variable appears at a level lower than the level at which it was bound. This is the key,
distinguishing form of undesirable behavior in multi-stage computation [25]. For example:
$\langle (\lambda x.{\sim}(x^0)^1)^1 \rangle^0$ is stuck since $x$ is used at level 0 but bound at level 1.

3. A non-Bracket value is the argument to Run. For example: $(\mathsf{run}\ 7^0)^0$ is stuck since $7^0$ is an
integer and not a piece of code.

4. A non-Bracket value is the argument to Escape. For example: $\langle (4^1 + {\sim}(7^0)^1)^1 \rangle^0$.

We wish to treat terms of the forms above as faulty. We will show that if a term is typable,
then  it  is  not  faulty,  and  neither  can  it  reduce  to  a  faulty  term.  We  formalize  this  notion  in  the 
next  sections. 

We  can  now  present  the  following  formal  specification  for  the  set  of  faulty  terms  F: 

1. $c[(\langle e^{n+1} \rangle^{n}\ e')^{n}] \in F$ and $c[(i^{n}\ e')^{n}] \in F$. Non-λ terms in an application, like
$(5^0\ 3^0)^0$ and $(\langle 5^2 \rangle^1\ 3^1)^1$.

2. $c[(\lambda x.c'[x^{n}])^{m}] \in F$ where $m > n$. Variables at too low a level, like $\langle (\lambda x.{\sim}(x^0)^1)^1 \rangle^0$.

3. $c[(\mathsf{run}\ (\lambda x.e)^{n})^{n}] \in F$ and $c[(\mathsf{run}\ i^{n})^{n}] \in F$. Non-Bracket in Run, like
$(\mathsf{run}\ (\lambda x.x^0)^0)^0$ and $(\mathsf{run}\ 4^1)^1$.

4. $c[({\sim}(\lambda x.e)^{n})^{n+1}] \in F$ and $c[({\sim}(i^{n}))^{n+1}] \in F$. Non-Bracket in Escape, like
$\langle (4^1 + {\sim}((\lambda x.x^0)^0)^1)^1 \rangle^0$ and $\langle (4^2 + {\sim}(5^1)^2)^2 \rangle^1$.

The success of our specification of faulty expressions depends on whether they help us characterize
the behavior of our reduction semantics. The following lemma is an example of such a
characterization, and is needed for our proof of type soundness.

Lemma 3 (Uniform Evaluation). Let $e^{n}$ be a closed term. If $e^{n}$ is not faulty then either it is a
value or it contains a redex.

Proof. By induction on the structure of $e^{n}$.
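
The four faulty forms lend themselves to a purely syntactic check. The following Python sketch is one way to express it, under stated assumptions: terms are unannotated tuples, levels are recomputed during the traversal (a Bracket raises the level by one, an Escape lowers it by one), and the name `faults` and the label strings are hypothetical.

```python
def faults(term, level=0, binders=None):
    """List the faulty forms occurring anywhere inside `term`.

    Terms: ("int", 5), ("var", "x"), ("lam", "x", body), ("app", f, a),
    ("bra", body), ("esc", body), ("run", body).
    `binders` maps each bound variable to the level of its lambda.
    """
    binders = {} if binders is None else binders
    tag = term[0]
    if tag == "int":
        return []
    if tag == "var":
        bound_at = binders.get(term[1])
        # Form 2: the variable is used below the level where it was bound.
        if bound_at is not None and bound_at > level:
            return ["variable below its binding level"]
        return []
    if tag == "lam":
        _, name, body = term
        return faults(body, level, {**binders, name: level})
    if tag == "app":
        _, fun, arg = term
        # Form 1: a Bracket or an integer in function position.
        here = ["non-lambda value applied"] if fun[0] in ("bra", "int") else []
        return here + faults(fun, level, binders) + faults(arg, level, binders)
    if tag == "bra":
        return faults(term[1], level + 1, binders)
    if tag == "esc":
        if level == 0:
            raise ValueError("Escape must occur inside Brackets")
        # Form 4: Escape applied to a lambda or an integer.
        here = (["non-Bracket value passed to Escape"]
                if term[1][0] in ("lam", "int") else [])
        return here + faults(term[1], level - 1, binders)
    if tag == "run":
        # Form 3: Run applied to a lambda or an integer.
        here = (["non-Bracket value passed to Run"]
                if term[1][0] in ("lam", "int") else [])
        return here + faults(term[1], level, binders)
    raise ValueError(f"unknown term {term!r}")
```

An empty result means that none of the four forms occurs in the term; it says nothing about whether the term can reduce to a faulty one, which is the job of the type system.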

4  Type  System 

The  main  obstacle  to  defining  a  sound  type  system  for  our  language  is  the  interaction  between 
Run  and  Escape.  While  this  is  problematic,  it  adds  significantly  to  the  expressiveness  of  a  staged 
language  [23],  so  it  is  worthwhile  overcoming  the  difficulty.  The  problem  is  that  Escape  allows 
Run to appear inside a Bracketed λ-abstraction, and it is possible for Run to “drop” that λ-bound
variable to a level lower than the level at which it is bound. The following example illustrates the
phenomenon:

\[
\langle (\lambda x.({\sim}(\mathsf{run}\ \langle x^1 \rangle^0)^0)^1)^1 \rangle^0 \;\to\; \langle (\lambda x.({\sim}x^0)^1)^1 \rangle^0
\]

To avoid this problem, for each λ-abstraction we need to count the number of surrounding Runs
for each occurrence of its bound variable (here $x$) in its body. We use this count to check that
there are enough Brackets around each formal parameter to execute all surrounding Runs without
leading to a faulty term.

The type system for λ-R is defined by a judgment $\Delta \vdash e^{n} : \tau, m$, where $e^{n}$ is our well-typed
expression, $\tau$ is the type of the expression, $m$ is the number of the surrounding Run annotations
of $e^{n}$, and $\Delta$ is the environment assigning types to term variables.


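Syntax

\[
\begin{array}{lrcl}
\text{types} & \tau & ::= & \tau \to \tau \mid \langle \tau \rangle \mid \mathsf{int} \\
\text{type assignments} & \Delta & ::= & \emptyset \mid x \mapsto (\tau, j)^{i};\ \Delta \\
\text{judgments} & J & ::= & \Delta \vdash e^{n} : \tau, m
\end{array}
\]

Type System

\[
\frac{\Delta(x) = (\tau, j)^{i} \qquad i + m \le n + j}{\Delta \vdash x^{n} : \tau, m}\ \text{Var}
\qquad
\frac{}{\Delta \vdash i^{n} : \mathsf{int}, m}\ \text{Int}
\]

\[
\frac{\Delta \vdash e^{n} : \langle \tau \rangle, m+1}{\Delta \vdash (\mathsf{run}\ e^{n})^{n} : \tau, m}\ \text{Run}
\qquad
\frac{\Delta \vdash e^{n+1} : \tau, m}{\Delta \vdash \langle e^{n+1} \rangle^{n} : \langle \tau \rangle, m}\ \text{Bra}
\qquad
\frac{\Delta \vdash e^{n} : \langle \tau \rangle, m}{\Delta \vdash ({\sim}e^{n})^{n+1} : \tau, m}\ \text{Esc}
\]

\[
\frac{\Delta \vdash e_1^{n} : \tau' \to \tau, m \qquad \Delta \vdash e_2^{n} : \tau', m}{\Delta \vdash (e_1\ e_2)^{n} : \tau, m}\ \text{App}
\qquad
\frac{(x \mapsto (\tau', m)^{n};\ \Delta) \vdash e^{n} : \tau, m}{\Delta \vdash (\lambda x.e^{n})^{n} : \tau' \to \tau, m}\ \text{Lam}
\]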


The  type  system  employs  a  number  of  mechanisms  to  reject  terms  that  either  are,  or  can 
reduce  to  faulty  terms.  The  App  rule  has  the  standard  role,  and  rejects  non-functions  applied  to 
arguments. 

The Escape and Run rules require that their operand must have type Code. This means
terms such as run 5 and <λx.~5> are rejected. But while this restriction in the Escape and Run
rules rejects faulty terms, it is not enough to reject all terms that can be reduced to faulty terms.
The first example of such a term is <λx.~(run <x>)>, which would be typable if we use only the
restrictions discussed above, but reduces to the term <λx.~x>, which would not be typable. The
second example involves an application (λf.<λx.~(f <x>)>) (λx.run x), which would also be typable,
but reduces to <λx.~x>. To reject such terms we need the Var rule.

The Var rule is instrumented with the condition $i + m \le n + j$. Here $i$ is the number of
Brackets surrounding the λ-abstraction where the variable was bound, $m$ is the number of Runs
surrounding this occurrence of the variable, $n$ is the number of Brackets surrounding this occurrence
of the variable, and $j$ is the number of Runs surrounding the λ-abstraction where it was bound.
This ensures that, between its binder and each of its occurrences, every variable has at least as
many Brackets as Runs surrounding it.
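
As a worked instance of this condition, consider the occurrence of $x$ in <λx.~(run <x>)>. The binder sits under one Bracket and no Run, so $i = 1$ and $j = 0$. The occurrence sits at level $n = 1$ (two Brackets less one Escape) under one Run, so $m = 1$:

\[
i + m = 1 + 1 = 2 \;\not\le\; 1 = 1 + 0 = n + j
\]

The Var rule therefore rejects the term. By contrast, in <λx.x> we have $i = n = 1$ and $j = m = 0$, and $1 + 0 \le 1 + 0$ holds.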

In previous work, we have attempted to avoid these two kinds of problems using two distinct
mechanisms: First, the argument of Run cannot contain free variables, and second, we prohibit the
λ-abstraction of Run. We used unbound polymorphic type variable names in a scheme similar to
that devised by Launchbury and Peyton Jones for ensuring the safety of state in Haskell [14]. It
turns out that not allowing any free variables is too strong, and that using polymorphism is too
weak. It is better to simply take account of the number of surrounding occurrences of Run in the
Var rule. This way we ensure that if Run is ever in a λ-abstraction, it can only strip away Brackets
that are explicitly apparent in that λ-abstraction.
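
The rules can be read as a syntax-directed checker. The following Python sketch illustrates this under stated assumptions: λ-abstractions carry an explicit parameter type (the type system above leaves it implicit), terms and types are encoded as tuples, and the names `typeof`, `typable` and `Rejected` are hypothetical.

```python
class Rejected(Exception):
    """Raised when no typing rule applies."""


def is_code(ty):
    return isinstance(ty, tuple) and ty[0] == "code"


def typeof(term, env=None, n=0, m=0):
    """Type of `term` at level n under m surrounding Runs.

    Types: "int", ("fun", a, b), ("code", t).
    Terms: ("int", 5), ("var", "x"), ("lam", "x", param_type, body),
    ("app", f, a), ("bra", body), ("esc", body), ("run", body).
    env maps a variable to (type, j, i), mirroring (tau, j)^i.
    """
    env = {} if env is None else env
    tag = term[0]
    if tag == "int":                                    # Int
        return "int"
    if tag == "var":                                    # Var
        if term[1] not in env:
            raise Rejected(f"unbound variable {term[1]}")
        ty, j, i = env[term[1]]
        if i + m > n + j:
            raise Rejected(f"Var: {i} + {m} <= {n} + {j} fails")
        return ty
    if tag == "lam":                                    # Lam
        _, x, param_ty, body = term
        body_ty = typeof(body, {**env, x: (param_ty, m, n)}, n, m)
        return ("fun", param_ty, body_ty)
    if tag == "app":                                    # App
        fun_ty = typeof(term[1], env, n, m)
        arg_ty = typeof(term[2], env, n, m)
        if not (isinstance(fun_ty, tuple) and fun_ty[0] == "fun"
                and fun_ty[1] == arg_ty):
            raise Rejected("App: operator does not accept the argument")
        return fun_ty[2]
    if tag == "bra":                                    # Bra
        return ("code", typeof(term[1], env, n + 1, m))
    if tag == "esc":                                    # Esc
        if n == 0:
            raise Rejected("Esc: Escape only occurs at level n + 1")
        ty = typeof(term[1], env, n - 1, m)
        if not is_code(ty):
            raise Rejected("Esc: operand is not code")
        return ty[1]
    if tag == "run":                                    # Run
        ty = typeof(term[1], env, n, m + 1)
        if not is_code(ty):
            raise Rejected("Run: operand is not code")
        return ty[1]
    raise ValueError(f"unknown term {term!r}")


def typable(term):
    """True when a closed term has a type at level 0 under no Runs."""
    try:
        typeof(term)
        return True
    except Rejected:
        return False
```

On this encoding, run 5 and <λx.~5> fail the code-type checks in Run and Esc, whereas <λx.~(run <x>)> and λx.run x are rejected only by the side condition in the Var case.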

5  Type  Soundness  of  the  Reduction  Semantics 

The type soundness proof closely follows the subject reduction proofs of Wright and Felleisen [27].
Once  the  reduction  semantics  and  type  system  have  been  defined,  the  syntactic  type  soundness 
proof  proceeds  as  follows: 

1.  Show  that  reduction  in  the  standard  reduction  semantics  preserves  typing.  This  is  called 
subject  reduction. 

2.  Show  that  faulty  terms  are  not  typable. 

If  programs  are  well-typed,  then  the  two  results  above  can  be  used  as  follows:  By  (1),  evaluation 
of a well-typed program will only produce well-typed terms. By Lemma 3, every such term is either
faulty,  or  a  value,  or  contains  a  redex.  The  first  case  is  impossible  by  (2).  Thus  the  program  either 
reduces  to  a  well-typed  value  or  it  diverges. 

5.1  Subject  Reduction 

The  Subject  Reduction  Lemma  states  that  a  well-typed  term  remains  well-typed  under  reduction. 
The  proof  relies  on  the  Demotion,  Promotion  and  Substitution  Type  Preservation  Lemmas.  First 
we  need  to  introduce  two  operations  on  the  environment  assigning  types  to  term  variables: 

\[
\Delta\uparrow_{(q,p)}(x) = (\tau, j+q)^{i+p} \iff \Delta(x) = (\tau, j)^{i}
\]

\[
\Delta\downarrow_{(q,p)}(x) = (\tau, j)^{i} \iff \Delta(x) = (\tau, j+q)^{i+p}
\]

These  two  operations  map  environments  to  environments.  They  are  needed  in  the  Promotion  and 
Demotion  Lemmas.  They  provide  an  environment  necessary  to  derive  a  valid  judgement  for  a 
promoted  or  demoted  well-typed  value.  Notice  that  we  have  the  following  two  properties: 

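\[
(\Delta\uparrow_{(q,p)})\uparrow_{(1,1)} = \Delta\uparrow_{(q+1,p+1)}
\qquad
(\Delta\uparrow_{(q+1,p+1)})\downarrow_{(1,1)} = \Delta\uparrow_{(q,p)}
\]

We write $v\uparrow^{p}$ and $v\downarrow^{p}$, respectively, for an abbreviation of $p$ applications of $\uparrow$ and $\downarrow$ to $v$. Note
that this operation is different from $\uparrow_{(q,p)}$ and $\downarrow_{(q,p)}$, which are functions on environments assigning
types to term variables.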
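
The two environment operations can be sketched in Python as follows. The dictionary encoding, in which each variable maps to a triple `(type, j, i)` standing for $(\tau, j)^{i}$, and the names `promote_env` and `demote_env` are assumptions made for illustration. An undefined demotion is represented by `None`.

```python
def promote_env(env, q, p):
    """Environment promotion: (tau, j)^i becomes (tau, j+q)^(i+p)."""
    return {x: (ty, j + q, i + p) for x, (ty, j, i) in env.items()}


def demote_env(env, q, p):
    """Environment demotion: (tau, j+q)^(i+p) becomes (tau, j)^i.

    Returns None when some entry has fewer than q Runs or a level below p,
    since no environment then satisfies the defining equation.
    """
    if any(j < q or i < p for (_, j, i) in env.values()):
        return None
    return {x: (ty, j - q, i - p) for x, (ty, j, i) in env.items()}
```

Both properties stated above hold for this encoding, and `demote_env` returns `None` exactly in the cases that the hypothesis of the Demotion Lemma rules out by requiring the demoted environment to be defined.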

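Lemma 4 (Demotion). If $q \le p$ and $\Delta_2\downarrow_{(q,p)}$ is defined and $\Delta_1 \cup \Delta_2 \vdash v^{n+p} : \tau, m+q$ then
$\Delta_1 \cup (\Delta_2\downarrow_{(q,p)}) \vdash v^{n+p}\downarrow^{p} : \tau, m$.

Proof. By induction on the structure of $v^{n+p}$. We develop only the variable case $v^{n+p} = x^{n+p}$.
There are only two possible sub-cases, which are:

\[
\frac{\Delta_1(x) = (\tau, j)^{i} \qquad i + m + q \le n + j}{(\Delta_1 \cup \Delta_2) \vdash x^{n+p} : \tau, m+q}\ \text{Var}
\]

By hypothesis $q \le p$ implies $m + i \le n + j$. Hence $(\Delta_1 \cup (\Delta_2\downarrow_{(q,p)})) \vdash x^{n} : \tau, m$.

\[
\frac{\Delta_2(x) = (\tau, j+q)^{i+p} \qquad i + m + 2q \le n + j + 2p}{(\Delta_1 \cup \Delta_2) \vdash x^{n+p} : \tau, m+q}\ \text{Var}
\]

Similar to the above sub-case. □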

Lemma 5 (Promotion). Let $q \le p$. If $\Delta_1 \cup \Delta_2 \vdash v^{n} : \tau, m$ then $\Delta_1 \cup (\Delta_2\uparrow_{(q,p)}) \vdash v^{n}\uparrow^{p} : \tau, m+q$.

Proof. By induction on $v^{n}$. □

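Lemma 6 (Substitution). If $j \le m$ and $\Delta_1 \cup (x \mapsto (\tau', j)^{i};\ \Delta_2) \vdash e^{n} : \tau, m$ and $\Delta_1 \vdash v^{i} : \tau', j$
then one of the following three judgments holds.

1. $\Delta_1 \vdash e^{n}[x^{n} := v^{i}\uparrow^{n-i}] : \tau, m$ if $n > i$.

2. $\Delta_1 \vdash e^{n}[x^{n} := v^{i}\downarrow^{i-n}] : \tau, m$ if $n < i$.

3. $\Delta_1 \vdash e^{n}[x^{n} := v^{n}] : \tau, m$, otherwise.

Proof. By induction on the structure of $e^{n}$. If $e^{n} = x^{n}$ then we have:

\[
\frac{\Delta(x) = (\tau, j)^{i} \qquad m + i \le n + j}{\Delta_1 \cup (x \mapsto (\tau, j)^{i};\ \Delta_2) \vdash x^{n} : \tau, m}
\]

• If $n < i$, and by the hypothesis $j \le m$, then $m + i > n + j$. Hence $\Delta_1 \cup (x \mapsto (\tau, j)^{i};\ \Delta_2) \vdash x^{n} : \tau, m$
cannot be typable.

• If $n > i$ then $m - j \le n - i$ and the Promotion Lemma 5 applies.

• If $i = n$, and by hypothesis $j \le m$ and $m + i \le n + j$, then $j = m$. Then $\Delta_1 \vdash e^{n}[x^{n} := v^{n}] : \tau, m$.

□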

Corollary 7 (β Rule). If $\Delta \vdash ((\lambda x.e^{n})^{n}\ v^{n})^{n} : \tau, m$ then $\Delta \vdash e^{n}[x^{n} := v^{n}] : \tau, m$.

Lemma 8 (Escape Rule). If $\Delta \vdash ({\sim}\langle e^{n+1} \rangle^{n})^{n+1} : \tau, m$ then $\Delta \vdash e^{n+1} : \tau, m$.

Proof. Straightforward from the type system. □

Lemma 9 (Run Rule). If $\Delta \vdash (\mathsf{run}\ \langle v^{n+1} \rangle^{n})^{n} : \tau, m$ then $\Delta \vdash v^{n+1}\downarrow : \tau, m$.

Proof. If $\Delta \vdash (\mathsf{run}\ \langle v^{n+1} \rangle^{n})^{n} : \tau, m$ then $\Delta \vdash v^{n+1} : \tau, m+1$ is valid. By Demotion Lemma 4,
$\Delta \vdash v^{n+1}\downarrow : \tau, m$ is valid. □

Proposition 10. If $\Delta \vdash e^{n} : \tau, m$ and $e^{n} \to e_1$ then $\Delta \vdash e_1 : \tau, m$.

Proof. By induction on the structure of $e^{n}$. If the rewrite is at the root then use Lemmas 8 and 9,
and Corollary 7. If $e^{n}$ contains a redex then apply the induction hypothesis. □

Proposition 11 (Subject Reduction). If $\Delta \vdash e^{n} : \tau, m$ and $e^{n} \to^{*} e_1$ then $\Delta \vdash e_1 : \tau, m$.

Proof. By induction on the length of the derivation. □

5.2  Faulty Terms

Lemma 12 (Faulty Terms are Not Typable). If $e \in F$ then there is no $\Delta, \tau, a$ such that
$\Delta \vdash e : \tau, a$.

Proof. By case analysis over the structure of $e$. Let $e = c_1[(\lambda x.c_2[x^{n}])^{i}]$ such that $n < i$, that is,
$i = n + k_1 + 1$. Assume that $\Delta \vdash e : \tau, m$. This implies that $x \mapsto (\tau', j)^{i};\ \Delta' \vdash x^{n} : \tau', p$. This means
that $i + p \le n + j$. Because $p = j + k_2$ then $j \le p$. This implies that $n + k_1 + 1 + j + k_2 \le n + j$,
which is impossible. The other cases are straightforward. □

6  Natural  Semantics 

In previous work, we defined core MetaML by a natural semantics [25]. While this style of presentation
is closer to the implementation of MetaML than the reduction semantics presented in this
paper, it is more complex. We have found that it was easier to prove type soundness first with
respect to the reduction semantics, and then to extend this result to the natural semantics.

In  this  paper,  we  present  a  more  concise  natural  semantics  for  MetaML  than  the  one  we  have 
presented  in  previous  work  [25]: 


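\[
\frac{e_1^{0} \hookrightarrow (\lambda x.e^{0})^{0} \qquad e_2^{0} \hookrightarrow v_1^{0} \qquad e^{0}[x := v_1^{0}] \hookrightarrow v_2^{0}}{(e_1\ e_2)^{0} \hookrightarrow v_2^{0}}
\qquad
\frac{}{(\lambda x.e^{0})^{0} \hookrightarrow (\lambda x.e^{0})^{0}}
\qquad
\frac{e^{0} \hookrightarrow \langle v^{1} \rangle^{0}}{({\sim}(e^{0}))^{1} \hookrightarrow v^{1}}
\]

\[
\frac{e_1^{0} \hookrightarrow \langle v_1^{1} \rangle^{0} \qquad (v_1^{1}\downarrow)^{0} \hookrightarrow v_2^{0}}{(\mathsf{run}\ e_1)^{0} \hookrightarrow v_2^{0}}
\qquad
\frac{e_1^{n+1} \hookrightarrow e_3^{n+1} \qquad e_2^{n+1} \hookrightarrow e_4^{n+1}}{(e_1\ e_2)^{n+1} \hookrightarrow (e_3\ e_4)^{n+1}}
\]

\[
\frac{e_1^{n+1} \hookrightarrow e_2^{n+1}}{(\lambda x.e_1^{n+1})^{n+1} \hookrightarrow (\lambda x.e_2^{n+1})^{n+1}}
\qquad
\frac{e_1^{n+1} \hookrightarrow e_2^{n+1}}{({\sim}(e_1^{n+1}))^{n+2} \hookrightarrow ({\sim}(e_2^{n+1}))^{n+2}}
\]

\[
\frac{e_1^{n+1} \hookrightarrow e_2^{n+1}}{(\mathsf{run}\ e_1)^{n+1} \hookrightarrow (\mathsf{run}\ e_2)^{n+1}}
\qquad
\frac{e_1^{n+1} \hookrightarrow e_2^{n+1}}{\langle e_1^{n+1} \rangle^{n} \hookrightarrow \langle e_2^{n+1} \rangle^{n}}
\]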

A key property of this presentation is that it avoids the explicit use of a gensym or newname
function for renaming abstractions at levels greater than zero. This improvement avoids the problems
that Moggi points out regarding the use of such stateful functions in defining the semantics
of two-level languages [18].

Now we move on to present some fundamental results about the untyped λ-R language, and
use  these  results,  in  addition  to  the  soundness  of  the  type  system  with  respect  to  the  reduction 
semantics,  to  prove  the  soundness  of  the  type  system  with  respect  to  the  natural  semantics. 

We say that two terms $e_1$ and $e_2$ are observationally equivalent, written $e_1 \approx e_2$, if for any
context $c[\,]$ such that both $c[e_1]$ and $c[e_2]$ are closed, then $c[e_1]^{0} \hookrightarrow v_1^{0}$ if and only if $c[e_2]^{0} \hookrightarrow v_2^{0}$,
and $v_1^{0} = i^{0}$ if and only if $v_2^{0} = i^{0}$ when both relations are defined.

Lemma 13. If $e^{n} \hookrightarrow v^{n}$ then $e^{n} \to^{*} v^{n}$.

Proof. By induction on the proof tree for $e^{n} \hookrightarrow v^{n}$. □

Lemma 14. If $e \to^{*} v$ then $e \hookrightarrow v'$.

Proof. This proof requires a Standardization Theorem along the lines of Plotkin [20], but one
extended to deal with Brackets, Escape and Run. We omit the details for the sake of brevity.
Please see the technical report for the full details [24]. □

Corollary 15. There exists a value $v$ such that $\lambda\text{-R} \vdash e = v$ if and only if $e \hookrightarrow v'$.

Proof. Consequence of Lemmas 14 and 13. □

Theorem 16 (Soundness of Axiomatic Semantics). If $\lambda\text{-R} \vdash e_1 = e_2$ then $e_1 \approx e_2$.

Proof. If $e_1 \hookrightarrow v_1$ then by Corollary 15 $\lambda\text{-R} \vdash e_1 = v_1$. Hence, $\lambda\text{-R} \vdash e_2 = v_1$. By Corollary 15,
there exists a value $v_2$ such that $e_2 \hookrightarrow v_2$. By Lemma 13, $\lambda\text{-R} \vdash v_1 = v_2$. Since the axiomatic
semantics is Church-Rosser, we have $v_1 \to^{*} v$ and $v_2 \to^{*} v$. Thus, $e_1 \approx e_2$. □

We define undesirable behavior in the natural semantics in the classical manner: we introduce a
new “value” Wrong, written T, and a set of rules complementing the rules of the natural semantics,
and returning T in all these new cases. We call the combination of these two sets of rules the
augmented natural semantics.

Lemma 17. If $e$ evaluates to T in the augmented natural semantics then $e \to^{*} f$ and $f \in F$ and $f \neq v$.

Proof. By induction on the proof tree of the augmented natural semantics. □

Theorem 18 (Type Soundness). If $\Delta \vdash e : \tau, m$ and $e$ evaluates to $e'$ in the augmented natural
semantics then $e' \neq$ T.

Proof. We prove the contrapositive. If $e' =$ T, that is, $e$ evaluates to T, then by Lemma 17, $e \to^{*} f$
with $f \in F$. Hence by type soundness of the reduction semantics, $e$ is not typable. □

7  Related  Work 

Multi-stage  programming  techniques  have  been  used  in  a  wide  variety  of  settings,  including  run-time 
program  generation  in  ML  [17],  run-time  specialization  of  C  programs  [5,  4,  21,  9],  and  advanced 
dynamic compilation for C programs [1].

Nielson and Nielson present a seminal detailed study into a two-level functional programming
language [19]. This language was developed for studying code generation. Davies and Pfenning
show that a generalization of this language to a multi-level language called λ□ gives rise to a type
system closely related to a modal logic, and that this type system is equivalent to the binding-time
analysis of Nielson and Nielson [7]. Intuitively, λ□ provides a natural framework where LISP’s
quote and eval can be present in a language. The semantics of our Bracket and Run correspond
closely to those of quote and eval, respectively.

Glück and Jørgensen study partial evaluation in the generalized context where inputs can arrive
at an arbitrary number of times rather than just specialization-time and run-time [12]. They
also demonstrate that binding-time analysis in a multi-level setting can be done with efficiency
comparable to that of two-level binding-time analysis. Our notion of level is very similar to that
used by Glück and Jørgensen [10, 11].

Davies extended the Curry-Howard isomorphism to a relation between modal logic and the type
system for a multi-level language [6]. Intuitively, λ○ provides a good framework for formalizing
the presence of quote and quasi-quote in a language. The semantics of our Bracket and Escape
correspond closely to those of quote and quasi-quote, respectively. Previous attempts to combine
the λ□ and λ○ systems have not been successful [7, 6, 25]. To our knowledge, our work is the first
successful attempt to define a sound type system combining Brackets, Escape and Run in the same
language.

Moggi advocates a categorical approach to two-level languages, and uses indexed categories
to develop models for two languages similar to λ□ and λ○ [18]. He points out that two-level
languages generally have not been presented along with an equational calculus. Our paper has
eliminated this problem for MetaML, and to our knowledge, is the first presentation of a multi-level
language using axiomatic and reduction semantics.

8  Conclusion


In this paper, we have presented an axiomatic and reduction semantics for a language with three
staging constructs: Brackets, Escape, and Run. Arriving at the axiomatic and reduction semantics
was of great value in enhancing our understanding of the language. In particular, it helped us to
formalize an accurate syntactic characterization of faulty terms for this language. This characterization
played a crucial role in leading us to the type system presented here. Finally, it is useful
to note that our reduction semantics allows for β-reductions inside Brackets, thus giving us a basis
for verifying the soundness of the safe-β optimization that we discussed in previous work [25].

MetaML  currently  exists  as  a  prototype  implementation  that  we  intend  to  distribute  freely  on 
the  web.  The  implementation  supports  the  three  programming  constructs,  higher-order  datatypes 
(with  support  for  Monads),  Hindley-Milner  polymorphism,  recursion,  and  mutable  state.  The 
system has been used for developing a number of small applications, including a simple term-rewriting
system, monadic staged compilers, and numerous small benchmark functions.

We  are  currently  investigating  the  incorporation  of  an  explicit  recursion  operator  and  Hindley- 
Milner  polymorphism  into  the  type  system  presented  in  this  paper. 

Acknowledgements: We would like to thank John Matthews and Matt Saffell for comments on
a draft of this paper.

The Anatomy of a Component Generator¹

Walid Taha & Jim Hook
{walidt, hook}@cse.ogi.edu
The  Oregon  Graduate  Institute 


In  this  extended  abstract,  we  outline  some  essential  elements  of  a  conceptual  model  for  a  component  generation 
system.  This  model  is  based  on  an  extensive  study  of  a  large  number  of  high-level  program  generation  systems,  and 
the significant body of related literature. We focus our attention on the architectural elements of this model, and
briefly  discuss  the  technological  and  process  elements.  We  show  how  the  model  is  a  useful  basis  for  comparing 
component  generation  technologies.  With  a  rapidly  growing  area  like  component  generation,  it  is  hard  to  get  a  truly 
representative  sample  of  generators.  As  a  workaround,  we  illustrate  our  model  using  seven  significant  component 
generation  systems  developed  by  various  research  groups,  and  discuss  some  insights  that  the  model  provides.  We 
conclude  with  an  overview  of  the  current  status  of  our  investigation. 

1.  The  Pragmatic  Need  for  Models 

We  know  that  component  generation  can  be  very  beneficial  for  evolving  systems,  but  we  don’t  have  a  widely-accepted 
conceptual  model  for  component  generation  systems.  Conceptual  models  allow  us  to  categorize  and  distill  our 
knowledge  of  details  into  more  manageable  and  structured  information.  We  believe  that  such  a  model  would  facilitate 
better  communication  of  ideas,  within  our  own  research  group  (PacSoft),  within  the  component  generation  research 
area,  within  the  programming  languages  area,  and  with  the  outside  world.  For  example,  it  will  necessarily  play  an 
important  role  in  transferring  our  ideas  as  a  research  community  to  software  houses  that  can  develop  industry-strength, 
general  purpose  component  generators. 

We  have  been  working  towards  such  a  model  for  almost  three  years  now,  and  have  studied  over  100  related 
publications, in addition to being involved in PacSoft’s SDRR component generation project [KMB96]. Why has it
taken so much effort? The major hurdle is that interesting component generation systems emerge from many corners
of computer science, which often means incompatible vocabularies. For example, the word “Component” can have
significantly different meanings in different papers². The diversity of programming languages, operating systems, and
tools used in developing the generators, and of the researchers’ expectations from all of these, adds significantly to the
difficulty of understanding the literature in a manner that would allow us to compare and contrast two different
generation  technologies. 

2.  The  Architectural  Element 

Software architectures [PW92] communicate ideas about software systems, and are especially useful when parties
involved come from a variety of different backgrounds. Architectural descriptions provide an abstract basis for our
model, a basis that is independent of the technology underlying the generator, the development process, and the
application domain.

Even when composed of relatively simple subsystems, the collective architecture of a generator is often quite complex,
and involves a significant number of distinct artifacts and users. Artifacts include the generator, the input and output
of the generator, libraries, and the legacy system hosting the generated component. Users include the developers of
the generator, its input, and the libraries. Ideally, the input to the generator is a simple, compact specification that is
easy to maintain. However, it is often the case that an executable program cannot be generated solely from such
specifications. Therefore, it is common to find an additional (specification) language, often in the form of
annotations, for controlling the generator. There may even be a developer dedicated to this task.

¹ This research is supported by a contract with the USAF Materiel Command. Contract F19628-93-C-0069.
² In this paper, it will mean CORBA/COM-like components.

Hence,  a  model  for  component  generators  should  admit  all  possible  answers  to  the  following  questions: 

•  What  is  the  input  to  the  generator?  Who  writes  this  input? 

•  What  is  the  output  of  the  generator?  Who  uses  it? 

•  What  libraries  does  the  output  use?  Who  writes  these  libraries? 

•  How  does  the  generator  work?  Who  wrote  it,  and  how?  How  is  it  controlled? 

•  With  what  systems  does  the  generated  component  interact? 

While  it  is  not  common  to  consider  all  of  these  dimensions  of  variability  simultaneously,  this  is  precisely  what  is 
needed  when  we  wish  to  relate  and  contrast  more  than  one  existing  component  generation  system.  The  figure  below  is 
a schematic representing the minimal architectural schema that arises if the answer to each of the above questions
is  distinct. 


The figure above explicates the implicit complexity of even the simplest generative system. For instance, consider the
yacc parser generator [Joh75]. Development work on the generator itself has stopped, and hence, we usually don’t
think of either the developer or the source yacc.c. The generator input is the grammar proper, and the control
annotations are the directives regarding precedence and association. Note that control annotations need not be in a
separate file. The component developer and the generator controller are the same person. The grammar file could
also contain further control instructions about what library files the generator output might be using. The libraries
used by the generator output include lib.y.c, which contains the abstract machine for the parse table. The interface is
usually header files describing the legacy system functions that the parser uses. Finally, while we rarely see a user
directly interacting⁴ with the parser generated by yacc, the user of the legacy system is, indirectly, the component user.

³ Drawn in the Generator Description Language, GDL [TS97].

⁴ Interaction commutes, and hence, we could have drawn the component user directly connected to the generated
component, and the diagram would have had the same meaning.

2.1  Basic Distinguishing Characteristics

Certain aspects of the architecture sketched in the last section are “not negotiable”: a generative architecture has to
include a generator, a generator input, and a generated component. And every artifact that is not mechanically
generated must have an author. The architecture described above gives us a very natural basis for our model that
captures these essential invariants. However, it offers too many dimensions of variability. The design space is indeed
vast. But some of these dimensions are more informative than others, in that they are better discriminators between
various component generation systems. We have identified basic distinguishing characteristics:

1. Who is the primary user, that is, the “customer” the system is intended to benefit?

2. What expertise is expected from the main user?

3.  Which  users  are  distinct,  and  which  users  are  not?  For  example,  is  the  role  of  generator  development  identified 
with  the  role  of  generator  control? 

4.  Does  the  generator  have  a  distinct  notion  of  control  annotations? 

These factors are derived or computed from the architectural variabilities. In the following section, we illustrate the
relevance  of  these  criteria  by  considering  some  important  generative  systems. 


2.2  Application to Seven Research Component Generation Systems

For brevity, we will not review all the systems we have studied. Instead, we present a summary of our observations, and
then illustrate how these observations can be interpreted. In the following table, “=” between two different kinds of
users means that we did not find them to be treated differently. In cases where there is no explicit notion of control
annotations, the input to the generator can be viewed as being an “Implicit” control specification:


Systems                        Primary User(s)  Primary User’s Expertise                Distinct Users       Control Annotations
-----------------------------  ---------------  --------------------------------------  -------------------  -------------------
ISI [Bal81, Bal92]             GD, CD           GD: Meta-programmer, CU: Domain expert  CU, CD=LD, GC=GD     Pragmas
MIP [MKS97]                    CU               Domain expert                           CU=CD=GC, GD, LD     Implicit
GenVoca [BST+94]               LD               Programmer                              CU, CD=GC=LD, GD     Design rules
KIDS / SpecWare [Smi90, SJ94]  CD               Formal methods expert                   CU, CD=GC=LD, GD     Refinements
SDRR [BH+94, KMB96]            GD, CD           Domain expert                           CU=CD, GC=GD=LD      Implicit
Amphion [LPP+94]               CU               Domain expert                           CU=CD, GC=GD, LD     Implicit
AOP [GLM+97]                   CD               Programmer                              CU, CD=GC=GD=LD      Aspects

Let us consider the first case: In the ISI technology, the generator developer (GD) uses the POPART meta-programming
tool-kit and a relational extension of C or Java to develop the generator [Bal92, Wil81, Wil90]. In the
literature we surveyed, the roles of the generator controller (GC) and generator developer were not distinguishable.
Pragmas are used to guide the relational compiler as to how to implement relations.

The following sub-sections discuss two of the main observations that can be drawn on the basis of this information.

2.2.1  What  to  Mix,  and  What  to  Match 

Consider the kind of information that might interest a software engineer interested in building a component generator.
Some technologies address similar classes of users, such as ISI and SDRR, and MIP and Amphion. This means that
these technologies could be a good basis for synthetic systems combining the benefits of both. For example, SDRR’s
technology, which leverages functional programming, can benefit greatly from ISI’s meta-programming
technology, and vice versa. When a basic distinguishing characteristic identifies two systems, there are usually many
other (often less-abstract) dimensions in which they are different. For example, MIP and Amphion fall on distinct
points along the dimension of real-time constraints. We consider this dimension to be somewhat less abstract than
architecture, as it is more dependent on the application domain. Some of these dimensions should be in a model
for component generators, discussed in the next section.

Other  technologies  address  users  that  are  usually  not  emphasized  by  others.  For  example,  GenVoca  is  unique  in 
addressing  concerns  of  the  library  developer  (LD).  This  suggests  that  high-level  ideas  from  the  GenVoca  system 
might  be  readily  combinable  with  generation  technologies  covered  in  our  survey. 

2.2.2  How  to  Control  Generation 

Four very different kinds of annotations are being considered by three different groups, namely, ISI’s pragmas,
GenVoca’s design rules, KIDS and SpecWare refinements, and AOP’s aspects. These annotations are an important
characteristic of modern component generation systems that was not commonplace in earlier transformational
programming systems.

Control annotations can be viewed as Domain-Specific Languages (DSLs). For example, yacc’s specifications for
precedence of operators are one such DSL. In this light, we can say that the first three kinds of annotations are single
languages, and AOP’s aspects can be thought of as families of DSLs. We believe that the study of these
generator-control DSLs will play an important role in developing general-purpose, industry-standard component
generation systems.

3.  Technology  and  Process  Elements 


Our model also includes two other elements: the technology underlying the generation system, and the process by
which the generator itself is developed. Both can be viewed as refinements of the architectural model. The following
table summarizes some distinguishing characteristics of the systems surveyed:


Systems          Underlying Technology                       Generator Development

ISI              Meta-programming calculus and tools         Using POPART tools and relational C or Ada
MIP              Model-integrated real-time control          Using the MIP paradigm
GenVoca          Algorithm selection and object-orientation  Using design rules to specify acceptable library
                                                             combinations
KIDS / SpecWare  Formal verification                         Using specifications and refinements to characterize
                                                             and derive programs
SDRR             Typed, functional programming               Using SDRR to create the front-end of the SDRR pipeline
Amphion          Theorem proving and program synthesis       Using Meta-Amphion, a theory of the domain, and an
                                                             inference engine
AOP              AOP                                         Using (any technology?) to develop a weaver and aspects

4.  Conclusion 


We have outlined a model for component generation systems that we are currently developing. The model captures
some of the bare essentials required for an object of study to be considered a generator, without going too deeply into
the details of any particular system. We illustrated how it admits simple, clear, and objective criteria for comparing
component generation systems. Our work shows that there is significant diversity not only in the cultures and
application domains of contemporary component generation research projects, but also in technical problems that are
unique to the emerging research area of component generation, such as widespread interest in generation control.

Acknowledgments: We thank Lisa Walton, Andrew Black, Sherri Shubnan, Tao Autry, Therese Fisher, Shailish Godbole, and Amol Vyas for
Abstract

We describe a type system and typed semantics for call-by-value functional languages that
use a hierarchy of monads to describe and delimit a variety of effects, including non-termination,
exceptions, and state. The type system and semantics can be used to organize and justify a
variety of optimizing transformations in the presence of effects. In addition, we describe a
simple monad inferencing algorithm that computes the minimum effect for each subexpression
of a program, and provides more accurate effects information than local syntactic methods.

1 Introduction


Optimizers are often implemented as engines that repeatedly apply improving transformations to
programs. Among the most important transformations are propagation of values from their defining
site to their use site, and hoisting of invariant computations out of loops. If we use a pure
(side-effect-free) language based on the lambda calculus as our compiler intermediate language, these
transformations can be neatly described by the simple rules for beta-reduction

(Beta)      let x = e in b  =  b[e/x]

and for the interchange and lifting of bindings

(Exchange)  let x1 = e1 in (let x2 = e2 in b)
              =  let x2 = e2 in (let x1 = e1 in b)
            (x1 ∉ FV(e2); x2 ∉ FV(e1))

(RecHoist)  letrec f = (λx. let y = e1 in e2) in b
              =  let y = e1 in (letrec f = λx. e2 in b)
            (x, f ∉ FV(e1); y ∉ FV(b))

where the side conditions nicely express the data dependence conditions under which the
transformations are valid.¹ Effective compilers for pure, lazy functional languages (e.g., [10]) have been
conceived and built on the basis of such transformations, with considerable advantages for
modularity and correctness.

It would be nice to apply similar methods to the optimization of languages like ML, which
have side effects such as I/O, mutable state, and exceptions. Unfortunately, these “rearranging”
transformations are not generally valid for such languages. For example, if we apply (Beta) in a
situation where evaluating e performs output and x is mentioned twice in b, evaluating the resulting
expression might produce the output twice. In fact, once an eager evaluation order is fixed, even
non-termination becomes a “side effect.” For example, (RecHoist) is not valid unless e1 is known
to be terminating (and free of other effects too, of course).
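The failure of (Beta) under effects can be seen in a few lines of Python. This is only an illustrative sketch: the list `output` stands in for the outside world, and the function `e` is a hypothetical effectful expression that is not part of the paper.

```python
# Sketch: (Beta) duplicates effects when the bound expression is effectful.
output = []  # stands in for the "outside world"


def e():
    """Hypothetical effectful expression: writes 1, then yields 20."""
    output.append(1)
    return 20


def original():
    # let x = e in x + x   (e is evaluated once, eagerly)
    x = e()
    return x + x


def after_beta():
    # (x + x)[e/x] = e + e   (e is now evaluated twice)
    return e() + e()


output.clear()
r1 = original()
out1 = list(output)   # output produced by the original program

output.clear()
r2 = after_beta()
out2 = list(output)   # output produced after substitution
```

Both versions compute the same value, but the substituted version writes twice, so the two programs are observably different.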

A  similar  challenge  long  faced  lazy  functional  languages  at  the  source  level:  how  can  we  give 
the  power  of  side-effecting  operations  without  invalidating  simple  “equational  reasoning”  based  on 
(Beta)  and  similar  rules?  The  effective  solution  discovered  in  that  context  is  to  use  monads  [8,  12]. 
An  obvious  idea,  therefore,  is  to  use  monads  in  an  internal  representation  (IR)  for  compilers  of 
call-by-value  languages.  Some  initial  steps  in  this  direction  were  recently  taken  by  Peyton  Jones, 
Launchbury,  Shields,  and  Tolmach  [11].  The  aim  of  that  work  was  to  design  an  IR  suitable  for 
both  eager  and  lazy  source  languages.  In  this  paper  we  pursue  the  use  of  monads  with  particular 
reference  to  eager  languages  (only),  and  address  the  question  of  how  to  discover  and  record  several 
different sorts of effects in a single, unified monadic type system. We introduce a hierarchy of
monads, ordered by increasing “strength of effect,” and an inference algorithm for annotating
source program subexpressions with their minimal effect.

¹ Of course, the fact that a transformation is valid doesn’t mean that applying it will necessarily improve the
program. For example, (Beta) is not an improving transformation if e is expensive to compute and x appears many
times in b; similarly, (RecHoist) is not improving if f is not applied in b.

Past approaches to coping with effects have fallen into two main camps. One approach
(used, e.g., by SML of New Jersey [2] and the TIL compiler [16]) is to fall back on a weaker form
of (Beta), called (Beta_v), which is valid in eager settings. (Beta_v) restricts the bound expression
e to variables, constants, and λ-abstractions; since “evaluating” these expressions never actually
causes any computation, they can be moved and substituted with impunity. To augment this rule,
these compilers use local syntactic analysis to discover expressions that are demonstrably pure and
terminating. These analyses cannot “see through” function calls, but they can be quite effective,
particularly if the compiler inlines functions enthusiastically. The other approach (used, e.g., by
the ML Kit compiler [4]) uses a sophisticated effect inference system [14] to track the latent effects
of functions on a very detailed basis. The goals of this school are typically more far-reaching; the
aim is to use effects information to provide more generous polymorphic generalization rules (e.g.,
as in [19, 15]), or to perform significantly more sophisticated optimizations, such as automatic
parallelization or stack-allocation of heap-like data. In support of these goals, effect inference has
generally been used to track store effects at a fine-grained level.

Our approach is essentially a simple monomorphic variant of effect inference applied to a wider
variety of effects (including non-termination, exceptions, and IO), cast in monadic form, and
intended to support transformational code-motion optimizations. We infer information about latent
effects, but we do not attempt to calculate effects at a very fine level of granularity. In return,
our inference system is particularly simple to state and implement. However, there is nothing
fundamentally new about our system as compared with that of Talpin and Jouvelot [14], except
our decision to use a monadic syntax and validate it using a typed monadic semantics. A practical
advantage of the monadic syntax is that it makes it easy to reflect the results of the effect inference
in the program itself, where they can be easily consulted (and kept up to date) by subsequent
optimizations, rather than in an auxiliary data structure. An advantage of the monadic semantics
is that it provides a natural foundation for probing and proving the correctness of transformations
in the presence of a variety of effects.

In  related  work,  Wadler  [18]  has  recently  and  independently  shown  that  Talpin  and  Jouvelot’s 
effect  inference  system  can  be  applied  in  a  monadic  framework;  he  uses  an  untyped  semantics,  and 
considers  only  store  effects.  In  another  independent  project,  Benton  and  Kennedy  are  prototyping 
an  ML  compiler  using  a  monadic  encoding  similar  to  ours  [3]. 

2  Source  Language 

This section briefly describes an ML-like source language we use to explain our approach. The
call-by-value source language is presented in Figure 1. It is a simple, monomorphic variant of ML,
expressed in A-normal form [5], which explicitly binds a name to the result of each computation
and makes evaluation order completely explicit. The class const includes primitive functions as
well as constants. The Let construct is monomorphic; that is, Let(x,e,b) has the same semantics
and typing properties as would App(Abs(x,b),e) (were this legal A-normal form). The restriction
to a monomorphic language is not essential; see Section 5. All functions are unary; primitives like
Plus take a two-element tuple as argument. For simplicity of presentation, we restrict Letrec to
single functions.

datatype value =
    Var of var
  | Const of const

datatype const =
    Integer of int
  | True | False
  | DivByZero | ...
  | Plus | Minus | Times | Divide
  | EqInt | LtInt
  | EqBool | EqExn
  | WriteInt

datatype exp =
    Val of value
  | Abs of var * exp
  | App of value * value
  | If of value * exp * exp
  | Let of var * exp * exp
  | Letrec of var * var * exp * exp
  | Tuple of value list
  | Project of int * int * value
  | Raise of value
  | Handle of exp * value

Figure 1: Abstract Syntax for Source Language (presented as ML datatype).

The language is not explicitly typed, but the underlying types include the base types Int, Bool,
and Exn, tuples, and arrows. We use tuples as a surrogate for more general algebraic datatypes;
to permit type inference for Projects in the absence of declarations, we provide the total size
of the tuple as an additional parameter. We assume a supply of appropriate constants for each
base type. Exceptions carry values of type Exn, which are nullary exception constructors. Raise
takes an exception constructor; rather than providing a means for declaring such constructors,
we assume an arbitrary pool of constructor constants. Handle catches all exceptions that are
raised while evaluating its first argument and passes the associated exception value to its second
argument, which must be a handler function expecting an Exn. The body of the handler function
may or may not choose to reraise the exception depending on its value, which may be tested using
EqExn. The primitive function Divide has the potential to raise a particular exception DivByZero.
We supply WriteInt as a paradigmatic state-altering primitive; internal side-effects such as ML
reference manipulations would be handled similarly. All other primitives are pure and guaranteed
to terminate. The semantics of the remainder of the language are completely ordinary.

3  Intermediate  Language  with  Monadic  Types 

Figure 2 shows the abstract syntax of our monadic intermediate representation (IR). (For an
example of the code, look ahead to Figure 10.) For the most part, terms are the same as in the source
language, but with the addition of monad annotations on Let and Handle constructs and a new
Up construct; these are described in detail below. In addition, identifiers (and Raise expressions)
are explicitly typed, in order that we may easily compute the type of any closed expression.

datatype monad = ID | LIFT | EXN | ST

datatype mtyp = M of monad * vtyp
and vtyp =
    Int
  | Bool
  | Exn
  | Tup of vtyp list
  | Arrow of vtyp * mtyp

type varty = var * vtyp

datatype value =
    Var of var
  | Const of const

datatype exp =
    Val of value
  | Abs of varty * exp
  | App of value * value
  | If of value * exp * exp
  | Let of monad * varty * exp * exp
  | Letrec of varty * varty * exp * exp
  | Tuple of value list
  | Project of int * int * value
  | Raise of mtyp * value
  | Handle of monad * exp * value
  | Up of monad * monad * exp

Figure 2: Abstract Syntax for Monadic Typed Intermediate Language.

Values have ordinary value types (vtyps); expressions have monadic types (mtyps), which
incorporate a vtyp and a monad (possibly the ID monad). Since this is a call-by-value language, the
domain of each arrow type is a vtyp, but the codomain is an arbitrary mtyp. The typing rules
are given in Figure 3. In this figure, and throughout our discussion, t ranges over value types, m over
monads, v over values, c over constants, x over variables, and e over expressions. The initial type
environment is described in Figure 4.

For this presentation, we use four monads arranged in a simple linear order. In order of
“increasing effect” these are:

•  ID,  the  identity  monad,  which  describes  pure,  terminating  computations. 

•  LIFT,  the  lifting  monad,  which  describes  pure  but  potentially  non-terminating  computations. 

•  EXN,  the  monad  of  exceptions  and  lifting,  which  describes  computations  that  may  raise  an 
(uncaught)  exception,  and  are  potentially  non-terminating. 

•  ST,  the  monad  of  state,  exceptions,  and  lifting,  which  describes  computations  that  may  write 
to  the  “outside  world,”  may  raise  an  exception,  and  are  potentially  non-terminating. 

We write m1 < m2 iff m1 precedes m2 on this list. Intuitively, m1 < m2 implies that computations
in m2 are “more effectful” than those in m1; they can provoke any of the effects in m1 and then
some. This particular hierarchy captures most of the interesting distinctions and still gives us a
simple inference algorithm (see Section 5). More elaborately stratified monadic structure is certainly
possible; we discuss this in more detail below.
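The linear order on the four monads is easy to model directly. The following Python sketch is an illustration only: the names `Monad`, `leq`, and `join` are assumptions introduced here, and `join` simply picks the more effectful of two monads, which is the least upper bound in a linear order.

```python
from enum import IntEnum


class Monad(IntEnum):
    """The four monads, numbered in order of increasing effect."""
    ID = 0    # pure, terminating
    LIFT = 1  # pure, possibly non-terminating
    EXN = 2   # may raise an exception, possibly non-terminating
    ST = 3    # may write to the outside world, raise, or diverge


def leq(m1, m2):
    """True when m1 precedes or equals m2 in the hierarchy."""
    return m1 <= m2


def join(m1, m2):
    """Least monad into which computations in m1 and m2 can both be coerced."""
    return max(m1, m2)
```

For instance, the two arms of a conditional in LIFT and EXN would both be coerced into `join(Monad.LIFT, Monad.EXN)`, which is `Monad.EXN`.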

More formally, m1 < m2 implies that there exists an embedding up_{m1→m2} which, for every value
type t, maps the domain corresponding to M(m1,t) into the domain corresponding to M(m2,t). The
up functions generalize the more usual monad unit operations: up_{ID→m}(e) is equivalent to unit_m(e).
Each monad m also has a conventional bind_m operation which serves to compose computations in
m. Figure 5 gives semantic interpretations for types as complete partial orders (CPOs), and
for our monads, together with the associated up and bind functions. Note that the up functions
are defined in such a way that they compose, i.e., for all m0 ≤ m1 ≤ m2, we have
up_{m0→m2} = up_{m1→m2} ∘ up_{m0→m1}.

E(v) = t
------------------
E ⊢v Var v : t

Typeof(c) = t
------------------
E ⊢v Const c : t

E ⊢v v : t
------------------
E ⊢ Val v : M(ID,t)

E + {x : t1} ⊢ e : M(m2,t2)
------------------------------------------
E ⊢ Abs(x:t1,e) : M(ID, t1 -> M(m2,t2))

E ⊢v v1 : t1 -> M(m2,t2)    E ⊢v v2 : t1
------------------------------------------
E ⊢ App(v1,v2) : M(m2,t2)

E ⊢v v : Bool    E ⊢ e1 : M(m,t)    E ⊢ e2 : M(m,t)
------------------------------------------
E ⊢ If(v,e1,e2) : M(m,t)

E ⊢ e1 : M(m1,t1)    E + {x : t1} ⊢ e2 : M(m2,t2)    (m1 ≤ m2)
------------------------------------------
E ⊢ Let(m1,m2,x:t1,e1,e2) : M(m2,t2)

E + {f : t0 -> M(m1,t1), x : t0} ⊢ e1 : M(m1,t1)    (LIFT ≤ m1)
E + {f : t0 -> M(m1,t1)} ⊢ e2 : M(m2,t2)
------------------------------------------
E ⊢ Letrec(f : t0 -> M(m1,t1), x : t0, e1, e2) : M(m2,t2)

E ⊢v v1 : t1   ...   E ⊢v vn : tn
------------------------------------------
E ⊢ Tuple(v1,...,vn) : M(ID,Tup(t1,...,tn))

E ⊢v v : Tup(t1,...,tn)    (0 < i ≤ n)
------------------------------------------
E ⊢ Project(i,n,v) : M(ID,t_i)

E ⊢v v : Exn
------------------------------------------
E ⊢ Raise(M(EXN,t),v) : M(EXN,t)

E ⊢ e : M(m,t)    E ⊢v v : Exn -> M(m,t)    (EXN ≤ m)
------------------------------------------
E ⊢ Handle(m,e,v) : M(m,t)

E ⊢ e : M(m1,t)    (m1 ≤ m2)
------------------------------------------
E ⊢ Up(m1,m2,e) : M(m2,t)

Figure 3: Typing rules for intermediate language

Integer _          : Int
True, False        : Bool
DivByZero          : Exn
Plus, Minus, Times : Arrow(Tup[Int,Int], M(ID,Int))
Divide             : Arrow(Tup[Int,Int], M(EXN,Int))
EqInt, LtInt       : Arrow(Tup[Int,Int], M(ID,Bool))
EqBool             : Arrow(Tup[Bool,Bool], M(ID,Bool))
EqExn              : Arrow(Tup[Exn,Exn], M(ID,Bool))
WriteInt           : Arrow(Int, M(ST,Tup[]))

Figure 4: Typings for constants in initial environment

A typed semantics for terms is given in Figures 6 and 7. Environments ρ map identifiers to
values. This semantics is largely straightforward. However, the Let construct now serves to make
the composition of monadic computations explicit, and the Up construct makes monadic coercions
explicit. Intuitively,

Let(m1,m2,(x,t1),e1,e2)

evaluates e1, which has monadic type M(m1,t1), performing any associated effects, binds the
resulting value to x : t1, and then evaluates e2, which has monadic type M(m2,t2). Thus, it essentially
plays the role of the usual monadic bind operation; in particular, if m1 = m2, the semantic
interpretation of the above expression in environment ρ is just

bind_{m1}(E[[e1]]ρ)(λy.E[[e2]]ρ[x := y])

However, our typing rules (Figure 3) require only that m2 ≥ m1; i.e., e2 may be in a more effectful
monad than e1. The semantics of a general “mixed-monad” Let is

bind_{m2}(up_{m1→m2}(E[[e1]]ρ))(λy.E[[e2]]ρ[x := y])

The term Let(m2,m2,(x,t1),Up(m1,m2,e1),e2) has the same semantics, so the more general form
of Let is strictly redundant. But this form is useful, because it makes it easier to state (and recognize
left-hand sides for) many interesting transformations involving Let whose validity depends on the
monad m1 rather than on m2. For example, a “non-monadic” Let, for which (Beta) is always valid,
is simply one in which m1 = ID. Further examples will be shown in the next section.
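To make the mixed-monad Let concrete, here is a small Python sketch of the up and bind functions of Figure 5. The encoding is an assumption made for illustration: non-termination is modelled by a sentinel `BOTTOM`, lifted values are tagged tuples such as `("Ok", a)`, the state is a Python list of integers, and `up`, `bind`, `let`, `divideby`, and `writeint` are hypothetical helper names. It follows the typed reading in which the coerced computation is sequenced with the bind of the target monad m2.

```python
BOTTOM = object()            # sentinel standing in for non-termination
ID, LIFT, EXN, ST = range(4)
DIVBY0 = 42                  # the exception value used in Figure 6


def up(m1, m2, x):
    """Embed computation x of monad m1 into monad m2 (requires m1 <= m2)."""
    assert m1 <= m2
    if m1 == m2:
        return x
    if m1 == ID:                       # ID -> LIFT: wrap the value
        x, m1 = ("Val", x), LIFT
        if m2 == LIFT:
            return x
    if m1 == LIFT:                     # LIFT -> EXN: a value becomes Ok
        x = BOTTOM if x is BOTTOM else ("Ok", x[1])
        m1 = EXN
        if m2 == EXN:
            return x
    # EXN -> ST: thread the state through unchanged
    return lambda s: BOTTOM if x is BOTTOM else (x, s)


def bind(m, x, k):
    """Monadic bind for monad m."""
    if m == ID:
        return k(x)
    if m == LIFT:
        return BOTTOM if x is BOTTOM else k(x[1])
    if m == EXN:
        if x is BOTTOM:
            return BOTTOM
        return k(x[1]) if x[0] == "Ok" else x

    def run(s):                        # m == ST
        r = x(s)
        if r is BOTTOM:
            return BOTTOM
        (tag, a), s1 = r
        return k(a)(s1) if tag == "Ok" else r
    return run


def let(m1, m2, e1, body):
    """Mixed-monad Let: coerce e1 from m1 to m2, then bind in m2."""
    return bind(m2, up(m1, m2, e1), body)


def divideby(a1, a2):                  # an EXN computation
    return ("Ok", a1 // a2) if a2 != 0 else ("Fail", DIVBY0)


def writeint(a):                       # an ST computation
    return lambda s: (("Ok", ()), s + [a])
```

With these definitions, `let(EXN, ST, divideby(6, 3), writeint)([])` divides in EXN and then writes the quotient in ST, while a division by zero short-circuits the write and leaves the state untouched.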

T : vtyp → CPO

T[[Int]]                  =  Z
T[[Bool]]                 =  Z                              (0 represents false)
T[[Exn]]                  =  Z
T[[Tup(t1,...,tn)]]       =  T[[t1]] × ... × T[[tn]]        (n > 0)
T[[Tup()]]                =  1
T[[Arrow(t1,M(m2,t2))]]   =  T[[t1]] → M[[m2]](T[[t2]])

M : monad → CPO → CPO

M[[ID]]c     =  c
M[[LIFT]]c   =  c_⊥
M[[EXN]]c    =  (Ok(c) + Fail(Z))_⊥
M[[ST]]c     =  State → ((Ok(c) + Fail(Z)) × State)_⊥

bind_ID x k      =  k x

bind_LIFT x k    =  k a              if x = a_⊥
                    ⊥                if x = ⊥

bind_EXN x k     =  k a              if x = Ok(a)_⊥
                    Fail(b)_⊥        if x = Fail(b)_⊥
                    ⊥                if x = ⊥

bind_ST x k s    =  k a s'           if x s = (Ok(a),s')_⊥
                    (Fail(b),s')_⊥   if x s = (Fail(b),s')_⊥
                    ⊥                if x s = ⊥

up_{m→m} x       =  x
up_{ID→LIFT} x   =  x_⊥
up_{ID→EXN} x    =  Ok(x)_⊥
up_{ID→ST} x s   =  (Ok(x),s)_⊥

up_{LIFT→EXN} x  =  Ok(a)_⊥          if x = a_⊥
                    ⊥                if x = ⊥

up_{LIFT→ST} x s =  (Ok(a),s)_⊥      if x = a_⊥
                    ⊥                if x = ⊥

up_{EXN→ST} x s  =  (Ok(a),s)_⊥      if x = Ok(a)_⊥
                    (Fail(b),s)_⊥    if x = Fail(b)_⊥
                    ⊥                if x = ⊥

Figure 5: Semantics of Types and Monads

V : (value : t) → Env → T[[t]]

V[[Var v]]ρ               =  ρ(v)
V[[Const (Integer i)]]ρ   =  i
V[[Const True]]ρ          =  1
V[[Const False]]ρ         =  0
V[[Const Plus]]ρ          =  plus
V[[Const Divide]]ρ        =  divideby
V[[Const WriteInt]]ρ      =  writeint
V[[Const DivByZero]]ρ     =  divby0

plus(a1,a2)       =  a1 + a2
divideby(a1,a2)   =  Ok(a1/a2)_⊥        if a2 ≠ 0
                     Fail(divby0)_⊥     if a2 = 0
State             =  [Z]   (sequence of integers written so far)
writeint a s      =  (Ok(), append(s,[a]))_⊥
divby0            =  42

Figure 6: Semantics of Values

The semantics of the “non-proper morphism” Handle(e,v) deserve special attention.
Expression e may be in either EXN or ST, and the meaning of Handle depends on which; the ST version
must manipulate the state component. Note that there are two plausible ways to combine state
with exceptions; in the semantics we have given (as in ML), the state is not reverted when an
exception is handled. Incidentally, we don’t have to give a semantics when e is in ID or LIFT,
because the typing rule for Handle disallows these cases. Of course, these cases might appear in
source code; when typed IR is generated for them, e must be coerced into EXN with an explicit Up.²
A Raise expression is handled similarly; the typing rules force it into monad EXN, so semantics
need only be given for that case, but the whole expression may be coerced into ST by an explicit
Up if necessary.
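The choice not to revert the state can be illustrated with a Python sketch of the ST version of handle. The encoding is an assumption for illustration only: ST computations are functions from a list-valued state to a pair of a tagged result and a new state, `None` stands in for non-termination, and `handle_st`, `failing`, and `handler` are hypothetical names.

```python
def handle_st(x, h):
    """handle for the ST monad: run x, and on failure run handler h.

    The handler starts from the state s1 left behind by the failing
    computation, so writes performed before the raise are kept.
    """
    def run(s):
        r = x(s)
        if r is None:              # None models non-termination
            return None
        (tag, a), s1 = r
        if tag == "Ok":
            return r
        return h(a)(s1)            # state is NOT reset to s
    return run


# A computation that writes 1 and then raises exception value 42.
failing = lambda s: (("Fail", 42), s + [1])

# A handler that writes 2 and returns the exception value it received.
handler = lambda exn: (lambda s: (("Ok", exn), s + [2]))

result = handle_st(failing, handler)([])
```

Here `result` contains both writes: the one made before the exception was raised and the one made by the handler. A semantics that reverted the state would instead pass the original `s` to the handler.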

As mentioned above, our basic approach is not restricted to the totally-ordered set of monads
presented here. It extends naturally to any collection of monads forming a finite upper semi-lattice
under the up embedding operation. It does not suffice to have a partial order; we insist that any
two monads in the collection have a least upper bound with respect to embedding, so that we can
always find a unique monad into which two arbitrary expressions (e.g., the two arms of an if) can
be coerced. One might be tempted to describe such a lattice by specifying a set of “primitive”
monads encapsulating individual effects, and then assuming the existence of arbitrary “union”
monads representing combinations of effects. As the Handle discussion indicates, however, there
is often more than one way to combine two effects, so that it makes no sense to talk in a general
way about the “union” of two monads. Instead, it appears necessary to specify explicitly, for every
monad m in the lattice,

•  a semantic interpretation for m;

•  a definition for bind_m;

•  a definition of up_{m→m'} for each m' > m;³

•  for each non-proper morphism NP introduced in m, a definition of np_{m'} for every m' ≥ m.

The lack of a generic mechanism for combining monads is rather unfortunate, since it turns the
proofs of many transformation laws into lengthy case analyses; we conjecture that the theory
of monad transformers [9] might help organize such proofs into simpler form, but have not yet
attempted to apply it.

² Another possibility is to drop the entire Handle in favor of e, which by its type cannot raise an exception!

³ Since the (IdentUp) and (ComposeUp) laws (see Figure 8) must hold in a partial order, it suffices to define
up_{m→m'} for just enough choices of m' to guarantee the existence of least upper bounds, since these definitions will
imply the definition for arbitrary m'.

E : (exp : M(m,t)) → Env → M[[m]](T[[t]])

E[[Val v]]ρ                 =  V[[v]]ρ
E[[Abs(x,e)]]ρ              =  λy.E[[e]]ρ[x := y]
E[[App(v1,v2)]]ρ            =  (V[[v1]]ρ)(V[[v2]]ρ)
E[[If(v,e1,e2)]]ρ           =  if (V[[v]]ρ) (E[[e1]]ρ) (E[[e2]]ρ)
E[[Letrec(f,x,e1,e2)]]ρ     =  E[[e2]](ρ[f := fix(λf'.λv.E[[e1]](ρ[f := f', x := v]))])
E[[Tuple(v1,...,vn)]]ρ      =  (V[[v1]]ρ,...,V[[vn]]ρ)
E[[Project(i,n,v)]]ρ        =  proj_i(V[[v]]ρ)
E[[Raise(M(EXN,t),v)]]ρ     =  (Fail(V[[v]]ρ))_⊥
E[[Handle(m,e,v)]]ρ         =  handle_m(E[[e]]ρ)(V[[v]]ρ)
E[[Let(m1,m2,x,e1,e2)]]ρ    =  bind_{m2}(up_{m1→m2}(E[[e1]]ρ))(λy.E[[e2]]ρ[x := y])
E[[Up(m1,m2,e)]]ρ           =  up_{m1→m2}(E[[e]]ρ)

if v a_t a_f        =  a_t              if v ≠ 0
                       a_f              if v = 0

proj_i(v1,...,vn)   =  v_i

handle_EXN x h      =  Ok(a)_⊥          if x = Ok(a)_⊥
                       h a              if x = Fail(a)_⊥
                       ⊥                if x = ⊥

handle_ST x h s     =  (Ok(a),s')_⊥     if x s = (Ok(a),s')_⊥
                       h a s'           if x s = (Fail(a),s')_⊥
                       ⊥                if x s = ⊥

Figure 7: Semantics of Expressions
(IdentUp)      Up(m,m,e)  =  e

(ComposeUp)    Up(m0,m2,e)  =  Up(m1,m2,(Up(m0,m1,e)))
               (m0 ≤ m1 ≤ m2)

(MonadId1)     Let(m2,m3,x,Up(m1,m2,e),b)  =  Let(m1,m3,x,e,b)

(MonadId2)     Let(m1,m2,x,e,Up(ID,m2,x))  =  Up(m1,m2,e)
               (m1 ≤ m2)

(LetAssoc)     Let(m1,m3,x,Let(m2,m1,y,e1,e2),b)  =
                 Let(m2,m3,y,e1,Let(m1,m3,x,e2,b))
               (m2 ≤ m1; y ∉ FV(b))

(LetrecAssoc)  Let(m1,m2,x,Letrec(f,y,e1,e2),b)  =
                 Letrec(f,y,e1,Let(m1,m2,x,e2,b))
               (y ∉ FV(b))

(LetUp)        Let(m1,m3,x,e,Up(m2,m3,b))  =  Up(m2,m3,Let(m1,m2,x,e,b))

Figure 8: Generalized monad laws

4  Transformation  Rules 

In this section we attempt to motivate our IR, and in particular our choice of monads, by presenting
a number of useful transformation laws, which can be proved correct with respect to the denotational
semantics. (These proofs are straightforward but tedious, so are omitted here.) Of course, this is
by no means a complete set of rules needed by an optimizer; there are many others, both
general-purpose and specific to particular operators. Also, as noted earlier, not all valid transformations
are improvements.

Figure 8 gives general rules for manipulating monadic expressions. (MonadId1), (MonadId2),
and (LetAssoc) are generalizations of the usual laws for a single monad, which can be recovered
from these rules by setting m1 = ID in (MonadId1), and setting m1 = m2 in (MonadId2) and
(LetAssoc). (LetrecAssoc) is the corresponding associativity rule for Letrecs. (LetUp) permits
us to move expressions with weak effects in and out of coercions. The remaining rules let us do
housekeeping on coercions.
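As an illustration of such housekeeping, the following Python sketch applies (ComposeUp) from right to left and then (IdentUp) to a chain of coercions. The term encoding is an assumption made here: IR terms are tuples whose first component is the constructor name, monads are plain strings, and `simplify_up` is a hypothetical helper, not part of the paper.

```python
def simplify_up(e):
    """Collapse nested Up coercions using (ComposeUp) and (IdentUp).

    Terms are tuples; a coercion is ("Up", m1, m2, inner).
    Only the outer spine of Up nodes is simplified.
    """
    if e[0] != "Up":
        return e
    _, m1, m2, inner = e
    inner = simplify_up(inner)
    # (ComposeUp), right to left: Up(m1,m2,Up(m0,m1,e)) = Up(m0,m2,e)
    if inner[0] == "Up" and inner[2] == m1:
        m1, inner = inner[1], inner[3]
    # (IdentUp): Up(m,m,e) = e
    if m1 == m2:
        return inner
    return ("Up", m1, m2, inner)


nested = ("Up", "EXN", "ST", ("Up", "ID", "EXN", ("Val", "x")))
collapsed = simplify_up(nested)
```

After the call, `collapsed` is the single coercion `("Up", "ID", "ST", ("Val", "x"))`, which exposes the fact that the underlying expression is in ID.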

Figure 9 lists some valid laws for altering execution order. We have full beta reduction for
variables bound in the ID monad (BetaID). In general, the order of two bindings can be exchanged
if there is no data dependence between them, and if either of them is in the ID monad (ExchangeID)
or both are in or below the LIFT monad (ExchangeLIFT). The intuition for the latter rule is that
it is harmless to reorder two expressions even if one or both may not terminate, because we cannot
detect which one causes the non-termination. On the other hand, there is no similar rule for the EXN
monad, because we can distinguish different raised exceptions according to the constructor value
they carry. This is the principal point of difference between LIFT and EXN from an optimization
standpoint.

(BetaID)        Let(ID,m,x,e,b) = b[e/x]

(ExchangeID)    Let(m1,m3,x1,e1,Let(m2,m3,x2,e2,b)) =
                    Let(m2,m3,x2,e2,Let(m1,m3,x1,e1,b))
                (m1 = ID or m2 = ID; x1 ∉ FV(e2); x2 ∉ FV(e1))

(ExchangeLIFT)  Let(m1,m3,x1,e1,Let(m2,m3,x2,e2,b)) =
                    Let(m2,m3,x2,e2,Let(m1,m3,x1,e1,b))
                (m1, m2 ≤ LIFT; x1 ∉ FV(e2); x2 ∉ FV(e1))

(HoistID)       Letrec(f,x,Let(ID,m2,y,e1,e2),b) : M(m,t) =
                    Let(ID,m,y,e1,Letrec(f,x,e2,b))
                (f,x ∉ FV(e1))

(HoistEXN)      Letrec(f,x,Let(m1,m2,y,e1,e2),App(f,z)) =
                    Let(m1,m2,y,e1,Letrec(f,x,e2,App(f,z)))
                (m1 ≤ EXN; x,f ∉ FV(e1))

(IfID)          If(v,Let(ID,m,x,e1,e2),e3) = Let(ID,m,x,e1,If(v,e2,e3))
                (x ∉ FV(e3))

Figure 9: Exchange laws for monadic expressions
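The side conditions of (ExchangeID) and (ExchangeLIFT) can be read as a simple legality check on two adjacent bindings. The following Python sketch is illustrative only: the `Binding` record and `can_exchange` function are hypothetical names, the linear order ID ≤ LIFT ≤ EXN ≤ ST is assumed from the surrounding text, and the requirement that both Lets share the same result monad m3 is not modelled.

```python
from dataclasses import dataclass

# Assumed linear effect order: ID <= LIFT <= EXN <= ST.
RANK = {"ID": 0, "LIFT": 1, "EXN": 2, "ST": 3}


@dataclass(frozen=True)
class Binding:
    """One Let binding: the monad of the bound expression, the bound
    variable, and the free variables of the bound expression."""
    monad: str
    name: str
    free_vars: frozenset


def can_exchange(outer: Binding, inner: Binding) -> bool:
    """True if (ExchangeID) or (ExchangeLIFT) allows swapping the bindings."""
    # Neither binding may mention the variable bound by the other.
    independent = (outer.name not in inner.free_vars
                   and inner.name not in outer.free_vars)
    if not independent:
        return False
    # (ExchangeID): at least one of the two bindings is pure and terminating.
    either_in_id = outer.monad == "ID" or inner.monad == "ID"
    # (ExchangeLIFT): both bindings are at or below LIFT.
    both_at_most_lift = (RANK[outer.monad] <= RANK["LIFT"]
                         and RANK[inner.monad] <= RANK["LIFT"])
    return either_in_id or both_at_most_lift
```

Under this reading, two EXN bindings are never exchangeable, while an ST binding may still be swapped with an independent ID binding.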

Rule (HoistID) states that it is always valid to lift a pure expression out of a Letrec (if no data
dependence is violated). (HoistEXN) reflects a much stronger property: it is valid to lift a
non-terminating or exception-raising expression out of a Letrec if the recursive function is
guaranteed to be executed at least once. This is the principal advantage of distinguishing EXN from
the more general ST monad, for which the transform is not valid. Although the left-hand side of
(HoistEXN) may seem a crude way to characterize functions guaranteed to be called at least once,
and unlikely to appear in practice, it arises naturally if we systematically introduce loop headers
for recursions [1], according to the following law:

(Header)        Letrec(f,x,e,b) : M(m,t) =
                    Let(ID,m,f,Abs(z,Letrec(f',x,e[f'/f],App(f',z))),b)
                (f' ∉ FV(e))

Finally, we include the rule (IfID) as an example of the flexibility with which ID expressions can
be manipulated; there are similar rules for floating ID expressions out of other constructs.

As a (rather artificial) example of the power of these transformations, consider the code in
Figure 10. The computation of w is invariant, so we would like to hoist it above the recursive
function r. Because the binding for w is marked as pure and terminating, it can be lifted out of the
if using (IfID), and can then be exchanged with the pure bindings for s and t using (ExchangeID).
This positions it to be lifted out of r using (HoistID). Note that the monad annotations tell us
that w is pure and terminating even though it invokes the unknown function g, which is actually
bound to h.

let f:(Int -> M(ID,Int * Int)) -> M(ST,Int) =
  λg:(Int->M(ID,Int * Int)).
    letrec r:Int->M(ST,Int) =
      λx:Int.letID t:Int * Int = (x,1)
             in letID s:Bool = EqInt(t)
             in if s then
                  Up(ID,ST,0)
                else
                  letID w:Int * Int = g(3)
                  in letID y:Int = Plus(w)
                  in letID z:Int * Int = (x,y)
                  in letEXN x':Int = Divide(z)
                  in letST dummy:() = WriteInt(x')
                  in r(x')
    in r(10)
in let h:Int->M(ID,Int * Int) = λp:Int.(p,p)
in f(h)

Figure 10: Example of intermediate code, presented in an obvious concrete analogue of the abstract
syntax.

The example also exposes the limitations of monomorphic effects: if f were also applied to an
impure function, then g and hence w would be marked as impure, and the binding for w could not
be hoisted. In practice, it might be desirable to clone separate copies of f, specialized according to
the effectfulness of their g argument. Worse yet, consider a function that is naturally parametric
in its effect, such as map. Such a function will always be pessimistically annotated with an effect
reflecting the most-effectful function passed to it within the program. The obvious solution is to
give functions like map a generic type abstracted over a monad variable, analogous to an effect
variable in the system of Talpin and Jouvelot [14]. We believe our system can be extended to
handle such generic types, but we have not examined the semantic issues involved in detail.
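The pessimism of monomorphic effects can be illustrated with a small Python sketch. It assumes the linear order ID ≤ LIFT ≤ EXN ≤ ST; the function names are hypothetical and stand only for the idea that a single annotation must cover every call site, whereas cloning yields one copy per distinct argument effect.

```python
# Assumed linear effect order: ID <= LIFT <= EXN <= ST.
RANK = {"ID": 0, "LIFT": 1, "EXN": 2, "ST": 3}


def monomorphic_annotation(argument_effects):
    """Single effect recorded for a function-typed parameter: the most
    effectful of all the functions passed to it in the program."""
    return max(argument_effects, key=lambda m: RANK[m])


def clone_plan(argument_effects):
    """One specialized copy per distinct argument effect, least effectful first."""
    return sorted(set(argument_effects), key=lambda m: RANK[m])
```

For example, if a map-like function is called with two ID arguments and one ST argument, `monomorphic_annotation` gives ST for all three uses, while `clone_plan` suggests an ID copy and an ST copy.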

5  Monad  Inference 

It would be possible to translate source programs into type-correct IR programs by simply assuming
that every expression falls into the maximally-effectful monad (ST in our case). Every source Let
would become a LetST, every variable and constant would be coerced into ST, and every primitive
would return a value in ST. Peyton Jones et al. [11] suggest performing such a translation, and
then using the monad laws (analogous to those in Figure 8) and the worker-wrapper transform [13]
to simplify the result, hopefully resulting in some less-effectful expression bindings. The main
objection to this approach is that it doesn't allow calls to unknown functions (for which
worker-wrapper doesn't apply) to return non-ST results. For example, in the code of Figure 10, no
local syntactic analysis could discover that argument function g is pure and terminating.

E ⊢v v : Bool    E ⊢ e1 ⇒ e1' : M(m,t)    E ⊢ e2 ⇒ e2' : M(m,t)
----------------------------------------------------------------
          E ⊢ If(v,e1,e2) ⇒ If(v,e1',e2') : M(m,t)

E ⊢ e1 ⇒ e1' : M(m1,t1)    E + {x : t1} ⊢ e2 ⇒ e2' : M(m2,t2)    (m1 ≤ m2)
---------------------------------------------------------------------------
          E ⊢ Let(x,e1,e2) ⇒ Let(m1,m2,x:t1,e1',e2') : M(m2,t2)

E ⊢ e ⇒ e' : M(m1,t)    m1 ≤ m2
--------------------------------
E ⊢ e ⇒ Up(m1,m2,e') : M(m2,t)

Figure 11: Selected translation rules

To  obtain  better  control  over  effects,  we  have  developed  an  inference  algorithm  for  computing  the 
minimal  monadic  effect  of  each  subexpression  in  a  program.  Pure,  provably  terminating  expressions 
are  placed  in  ID,  pure  but  potentially  non-terminating  expressions  in  LIFT,  and  so  forth.  The 
algorithm  deals  with  the  latent  monadic  effects  in  functions,  by  recording  them  in  the  result  types. 
As  an  example,  it  produces  the  annotations  shown  in  Figure  10. 

The  input  to  the  algorithm  is  an  untyped  program  in  the  source  language;  the  output  is  a 
program  in  the  typed  IR.  The  algorithm  performs  ordinary  type  inference,  monad  inference,  and 
program  translation  simultaneously.  The  type  inference  aspect  uses  unification  in  a  completely 
conventional  way,  except  that  unifying  the  codomain  mtyps  of  two  arrow  types  requires  unifying 
their  monad  components  as  well  as  their  vtyp  components.  We  therefore  omit  a  detailed  description 
of  vtyp  unification. 

The translation aspect is also quite straightforward. We can turn each typing rule in Figure 3
into a translation rule simply by recording the inferred type and monad information in the
appropriate annotation slots of the output and combining the translations of subterms in the obvious
manner. As examples, Figure 11 shows the translation rules corresponding to the typing rules for
If, Let, and Up. In cases where a monad or type appears in the translation output, such as m1
and t1 in the Let rule, a fresh monad or type variable is created and inserted in the output for
subsequent instantiation. Type variables are instantiated by unification; the method of instantiating
monad variables is described below.

Excluding the rule for Up, the resulting translation rules form a deterministic, syntax-directed
algorithm for translation, giving an output program with exactly the same term structure as the
input. However, the resulting program may not obey the monadic constraints in the typing rules.
Consider, for example, the source term If(x, Val y, Raise z). Since Val y is a value, its
translation is in the ID monad, whereas the translation of Raise z must be in the EXN or ST monad. To
glue together these subterm translations we must insert a coercion around the translation of the
Val term. The "translation" rule for Up is really a coercion insertion rule, which serves exactly this
purpose; it adds the necessary flexibility to the system to permit all monad constraints to be met.
Since this rule can be applied to any subexpression, it adds a problematic element of
nondeterminism to the system. Our solution is to insert a (single) Up coercion around every
subexpression, and rely on a postprocessing step to remove unneeded coercions using the (IdentUp)
rule. (The complete Standard ML code for the translation routine is given in Appendix A.)
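The "wrap every subexpression in Up" strategy can be sketched in a few lines of Python for the example term If(x, Val y, Raise z). This is a simplified illustration, not the authors' Standard ML routine: the `Translator` class is hypothetical, terms are plain tuples, value types are ignored, and only the Val, Raise, and If cases are handled.

```python
import itertools


class Translator:
    """Pads every translated subterm with an Up coercion to a fresh monad
    variable and records constraints (hi, lo), read as hi >= lo."""

    def __init__(self):
        self._ids = itertools.count()
        self.constraints = []

    def fresh(self):
        return f"m{next(self._ids)}"

    def bound(self, hi, lo):
        self.constraints.append((hi, lo))

    def wrap(self, term, monad):
        # Coerce from the term's own monad to a fresh, possibly larger one.
        outer = self.fresh()
        self.bound(outer, monad)
        return ("Up", monad, outer, term), outer

    def translate(self, e):
        tag = e[0]
        if tag == "Val":            # values are pure and terminating
            return self.wrap(e, "ID")
        if tag == "Raise":          # raising needs at least EXN
            return self.wrap(e, "EXN")
        if tag == "If":
            _, v, e1, e2 = e
            t1, m1 = self.translate(e1)
            t2, m2 = self.translate(e2)
            # Both branches must live in the same monad: m1 = m2,
            # expressed as a pair of inequalities.
            self.bound(m1, m2)
            self.bound(m2, m1)
            return self.wrap(("If", v, t1, t2), m1)
        raise ValueError(f"unsupported term: {tag}")
```

Translating `("If", "x", ("Val", "y"), ("Raise", "z"))` yields a term in which the Val branch, the Raise branch, and the If itself are each wrapped in one Up, together with five constraints on three fresh monad variables.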

The final consideration is how to record and resolve constraints on the monad variables. Such
constraints are introduced explicitly by the side conditions in the Let, Letrec, and Up rules,
implicitly by the equating of monads from subexpressions in the If and Handle rules, and (even
more) implicitly as a result of ordinary unification of arrow types, which mention monads in their
codomains. The side-condition constraints are all inequalities of the form m1 ≥ m2, where m1 is a
monad variable and m2 is a variable or an explicit monad. The implicit constraints are all equalities
m1 = m2; for uniformity, we replace these by a pair of inequalities: m1 ≥ m2 and m2 ≥ m1. We
collect constraints as a side-effect of the translation process, simply by adding them to a global list.

It is very common for there to be circularities among the monad constraints. To solve the
constraint system, we think of it as a directed graph with a node for each monad and monad
variable, and an edge from m1 to m2 for each constraint m1 ≥ m2. We then partition the graph
into its strongly connected components, and sort the components into reverse topological order.
We process one component at a time, in this order. Since ≥ is a partial order, all the nodes in a
given component must be assigned the same monad; once this has been determined, it is assigned
to all the variables in the component before proceeding to the next component. To determine the
minimum possible correct assignment for a component, we consult all the edges from nodes in that
component to nodes outside the component; because of the order of processing, these nodes must
already have received a monad assignment. The maximum of these assignments is the minimum
correct assignment for this component. If there are no such edges, the minimum correct assignment
is ID. This algorithm is linear in the number of constraints, and hence in the size of the source
program.
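The solving procedure just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: constraints are pairs `(hi, lo)` read as hi ≥ lo, the monads form the linear order ID ≤ LIFT ≤ EXN ≤ ST, Tarjan's algorithm is used because it emits strongly connected components in reverse topological order, and the handling of explicit monads that appear as graph nodes (they contribute their own rank, and an inconsistent component raises an error) is a choice made here rather than something the text specifies.

```python
# Assumed linear effect order: ID <= LIFT <= EXN <= ST.
MONADS = ["ID", "LIFT", "EXN", "ST"]
RANK = {m: i for i, m in enumerate(MONADS)}


def solve(constraints):
    """Return the least assignment node -> monad satisfying every (hi, lo)."""
    graph = {}
    for hi, lo in constraints:
        graph.setdefault(hi, []).append(lo)   # edge hi -> lo for hi >= lo
        graph.setdefault(lo, [])

    # Tarjan's SCC algorithm (recursive, adequate for a sketch).
    index, low, stack, on_stack, components = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:                # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            components.append(comp)

    for v in graph:
        if v not in index:
            visit(v)

    assignment = {}
    for comp in components:                   # lower bounds are solved first
        members = set(comp)
        rank = 0                              # no outgoing edges: ID
        for v in comp:
            if v in RANK:                     # explicit monad node (assumption)
                rank = max(rank, RANK[v])
            for w in graph[v]:
                if w not in members:
                    rank = max(rank, RANK[assignment[w]])
        for v in comp:
            if v in RANK and RANK[v] != rank:
                raise ValueError(f"unsatisfiable constraint on {v}")
            assignment[v] = MONADS[rank]
    return assignment
```

For the constraints m0 ≥ ID, m1 ≥ EXN, m0 ≥ m1, m1 ≥ m0, m2 ≥ m0, the variables m0 and m1 fall into one component, and all three variables are assigned EXN.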

To  summarize,  we  perform  monad  inference  by  first  translating  the  source  program  into  a 
form  padded  with  coercion  operators  and  annotated  with  monad  variables,  meanwhile  collecting 
constraints  on  these  variables,  and  then  solving  the  resulting  constraint  system  to  fill  in  the  variables 
in  the  translated  program.  The  resulting  program  will  contain  many  null  coercions  of  the  form 
Up(m,m,e);  these  can  be  removed  by  a  single  postprocessing  pass. 
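The final two steps, filling in the solved monad variables and deleting null coercions, can be combined in one traversal. The Python sketch below assumes terms are nested tuples in which a coercion is written `("Up", m_from, m_to, body)`; the function name `finish` and this term encoding are illustrative choices, and only the monad slots of Up nodes are substituted.

```python
def finish(term, assignment):
    """Substitute solved monads into Up nodes and drop Up(m, m, e)."""
    if not isinstance(term, tuple):
        return term                       # variable name or constant
    if term and term[0] == "Up":
        _, m_from, m_to, body = term
        # Explicit monads are not in the assignment and are left unchanged.
        m_from = assignment.get(m_from, m_from)
        m_to = assignment.get(m_to, m_to)
        body = finish(body, assignment)
        if m_from == m_to:
            return body                   # null coercion: (IdentUp)
        return ("Up", m_from, m_to, body)
    return tuple(finish(part, assignment) for part in term)
```

Applied to a padded translation of If(x, Val y, Raise z) in which every monad variable has been solved to EXN, only the coercion from ID to EXN around the Val term survives.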

Our algorithm is very similar to that of Talpin and Jouvelot [14], restricted to a monomorphic
source language. Both algorithms generate essentially the same sets of constraints. Talpin
and Jouvelot apparently solve the constraints using unification; the full details of the unification
algorithm are not given. It would be natural to extend our algorithm to handle Hindley-Milner
polymorphism for both types and monads in the Talpin-Jouvelot style. The idea is to generalize all
free type and effect variables in let definitions and allow different uses of the bound identifier to
instantiate these in different ways. In particular, parametric functions like map could be used with
many different monads, without one use "polluting" the others. (Note that functions not wholly
parametric in their effects would place a minimum effect bound on permissible instantiations for
monad variables.) This form of monad polymorphism seems desirable even in the absence of type
polymorphism (e.g., resulting from explicit monomorphization [17]).

In  whole-program  compilation,  the  complete  set  of  effect  instantiations  would  be  known.  This 
set  could  be  used  to  put  an  upper  effect  bound  on  monad  variables  within  definition  bodies  and 
hence  determine  what  transformations  are  legal  there.  Alternatively,  it  could  be  used  to  guide  the 
generation  of  effect-specific  clones  as  suggested  in  the  previous  section.  Generalization  of  effect 
variables  would  also  support  safe  separate  compilation,  though  drawbacks  would  remain:  in  the 
absence  of  complete  information  about  uses  of  a  definition,  any  variable  monad  in  the  body  of 
the  definition  must  be  treated  as  ST,  the  most  “effectful”  monad,  for  the  purposes  of  performing 
transformations  within  the  body. 

6  Status  and  Conclusions 

We believe our approach has the merits of simplicity and reasonable effectiveness. We have
implemented the monad inference algorithm for an extended version of the IR described here, which
supports full Standard ML; we are currently measuring its effectiveness using the backend of our
RML compiler system [17].
Appendix: Code for monadic inference translation

(* Unifying two computation types equates their monads (as a pair of
   inequalities) and unifies their value types. *)
fun unify_typ (M(ma,ta),M(mb,tb)) =
      (bound_monad(ma,mb); bound_monad(mb,ma); unify_vtyp(ta,tb))

and unify_vtyp (a:vtyp,b:vtyp) = ...unify_typ...

and bound_monad (ma:monad,mb:monad) = ...

fun type_value (env:id -> vtyp) (v:value) : vtyp = ...

(* Pad an expression with a single Up coercion to a fresh monad m'. *)
fun wrap (e:exp, t as M(m,vt)) : exp * typ =
    let val m' = new_monad()
    in bound_monad(m',m);
       (Up(m,m',e),M(m',vt))
    end


fun translate_exp (env:id -> vtyp) (e:exp) : exp * typ =
    case e of
      Source.Val v => let val t' = type_value env v
                      in wrap(Val v, M(ID,t'))
                      end
    | Source.Abs(x,e) =>
        let val t = new_vtyp()
            val (e',t') = translate_exp (extend env (x,t)) e
        in wrap(Abs((x,t),e'),M(ID,Arrow(t,t')))
        end
    | Source.App(v1,v2) =>
        let val t = new_vtyp() and u = new_typ()
            val t1' = type_value env v1
            val t2' = type_value env v2
        in unify_vtyp(Arrow(t,u),t1');
           unify_vtyp(t,t2');
           wrap(App(v1,v2),u)
        end
    | Source.If(v,e1,e2) =>
        let val t' = type_value env v
            val (e1',t1') = translate_exp env e1
            val (e2',t2') = translate_exp env e2
        in unify_vtyp(t',Bool);
           unify_typ(t1',t2');       (* both branches share one monad *)
           wrap(If(v,e1',e2'),t1')
        end
    | Source.Let(x,e1,e2) =>
        let val (e1',t1' as M(m1',vt1')) = translate_exp env e1
            val (e2',t2' as M(m2',vt2')) =
                  translate_exp (extend env (x,vt1')) e2
        in bound_monad(m2',m1');
           wrap(Let(m1',m2',(x,vt1'),e1',e2'),t2')
        end
    | Source.Letrec(f,x,e1,e2) =>
        let val t = new_vtyp() and u as M(um,uvt) = new_typ()
            val (e1',t1') =
                  translate_exp (extend (extend env (f,Arrow(t,u))) (x,t)) e1
            val (e2',t2') = translate_exp (extend env (f,Arrow(t,u))) e2
        in unify_typ(t1',u);
           bound_monad(um,LIFT);     (* recursive functions may not terminate *)
           wrap(Letrec((f,Arrow(t,u)),(x,t),e1',e2'),t2')
        end

    | Source.Tuple vs =>
        let val ts = map (type_value env) vs
        in wrap(Tuple vs,M(ID,Tup ts))
        end
    | Source.Proj(i,n,v) =>
        let val t' = type_value env v
            fun upto (x,y) = if x > y then [] else x::(upto (x+1,y))
            (* one fresh vtyp per tuple component *)
            val vts = map (fn _ => new_vtyp()) (upto (0,n-1))
            val t = List.nth(vts,i) handle Subscript => raise Bad "Proj index"
        in unify_vtyp(t',Tup(vts));
           wrap(Proj(i,n,v),M(ID,t))
        end
    | Source.Raise(v) =>
        let val vt = new_vtyp()
            val t = M(EXN,vt)
            val t' = type_value env v
        in unify_vtyp(t',Exn);
           wrap(Raise(t,v),t)
        end
    | Source.Handle(e,v) =>
        let val u as M(um,uvt) = new_typ()
            val (e',t') = translate_exp env e
            val vt' = type_value env v
        in unify_typ(u,t');
           bound_monad(um,EXN);      (* a handled expression is at least EXN *)
           unify_vtyp(vt',Arrow(Exn,t'));
           wrap(Handle(um,e',v),t')
        end